Regular expressions: Logstash grok filter Apache pattern

Here is a sample Apache Tomcat log entry:

portal.portal.some.thing.int:8443 13.233.220.113 - - [09/Sep/2019:00:08:02 +0200]"GET /en/search-results?p_p_id=portal201_WAR_portal201_INSTANCE_q8EzsBteHybf&p_p_lifecycle=1&p_p_state=normal&queryText=Poll&facet.collection=A??Lex%2CAMsom%2CAMss%2WebPage%2SummariesOfSomething&startRow=1&resultsPerPage=10&SEARCH_TYPE=SIMPLE HTTP/1.1" 230 334734 6261 - - 35S64857F6860FDFC0F60B5B47A97E18
10.235.350.103 94.62.15.157, 10.435.230.101,10.134.046.2

I want to capture the following values:

09/Sep/2019:00:08:02 +0200

/en/search-results?p_p_id=portal2....

35S64857F6860FDFC0F60B5B47A97E18

Can you help me? I only want to index those values and drop everything else. Is that possible? Thanks.


Use this grok pattern:

%{GREEDYDATA:field1} %{IP:ip1} - - \[%{GREEDYDATA:date}\]\s*"%{WORD:method} %{GREEDYDATA:request}" %{WORD:numbers} %{WORD:numbers} %{WORD:numbers} - - %{WORD:last_parameter}

Input:

portal.portal.some.thing.int:8443 13.233.220.113 - - [09/Sep/2019:00:08:02 +0200]"GET /en/search-results?p_p_id=portal201_WAR_portal201_INSTANCE_q8EzsBteHybf&p_p_lifecycle=1&p_p_state=normal&queryText=Poll&facet.collection=A??Lex%2CAMsom%2CAMss%2WebPage%2SummariesOfSomething&startRow=1&resultsPerPage=10&SEARCH_TYPE=SIMPLE HTTP/1.1" 230 334734 6261 - - 35S64857F6860FDFC0F60B5B47A97E18
10.235.350.103 94.62.15.157, 10.435.230.101,10.134.046.2

Output:

{
 "field1": [
    [
     "portal.portal.some.thing.int:8443"
    ]
  ],
 "ip1": [
    [
     "13.233.220.113"
    ]
  ],
 "IPV6": [
    [
      null
    ]
  ],
 "IPV4": [
    [
     "13.233.220.113"
    ]
  ],
 "date": [
    [
     "09/Sep/2019:00:08:02 +0200"
    ]
  ],
 "method": [
    [
     "GET"
    ]
  ],
 "request": [
    [
     "/en/search-results?p_p_id=portal201_WAR_portal201_INSTANCE_q8EzsBteHybf&p_p_lifecycle=1&p_p_state=normal&queryText=Poll&facet.collection=A??Lex%2CAMsom%2CAMss%2WebPage%2SummariesOfSomething&startRow=1&resultsPerPage=10&SEARCH_TYPE=SIMPLE HTTP/1.1"
    ]
  ],
 "numbers": [
    [
     "230",
     "334734",
     "6261"
    ]
  ],
 "last_parameter": [
    [
     "35S64857F6860FDFC0F60B5B47A97E18"
    ]
  ]
}

The fields you want are:

  • date
  • request
  • last_parameter

You can remove the other fields with the remove_field option of the mutate filter.
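
Putting the two steps together, the filter section of the pipeline might look roughly like the sketch below. It assumes the raw log line arrives in the default message field and that the field names are the ones from the grok pattern above. The conditional on the _grokparsefailure tag is an addition for illustration, so that nothing is removed from events the pattern failed to match. Whether to also drop message is a choice the question does not settle.

```
filter {
  grok {
    # Assumption: the raw log line is stored in the "message" field.
    # Single quotes keep the double quotes inside the pattern literal.
    match => {
      "message" => '%{GREEDYDATA:field1} %{IP:ip1} - - \[%{GREEDYDATA:date}\]\s*"%{WORD:method} %{GREEDYDATA:request}" %{WORD:numbers} %{WORD:numbers} %{WORD:numbers} - - %{WORD:last_parameter}'
    }
  }

  # Only strip fields when the grok match succeeded.
  if "_grokparsefailure" not in [tags] {
    mutate {
      # Keeps date, request and last_parameter; drops the other captures.
      remove_field => ["field1", "ip1", "method", "numbers"]
    }
  }
}
```

The IPV4 and IPV6 entries in the output above are sub-captures of the IP pattern. With grok's default settings they are typically not stored as separate fields, so they are not listed in remove_field here.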