Parsing Log Data for Compression and Querying
Logs are textual records of system events, and analyzing them is vital to understanding system behavior. The increasing complexity of systems leads to a larger volume of log data. This, combined with faster development cycles and the widespread distribution of systems, poses significant challenges in processing and storing log data efficiently while minimizing computational and storage costs. This thesis conducts a systematic literature review of log parsing and compression techniques. A novel log parsing approach is proposed that enables the detection of previously elusive tokens within log records. These tokens contribute to constructing regular expressions that aid in the log compression process. The effectiveness of the parsing approach is evaluated by comparing it with twelve other parsers using a benchmark. The evaluation reveals that the impact of parsing results on the compression ratio is minimal. Additionally, data compressed by specialized log compressors show positive characteristics not found in general-purpose compression techniques.
Title supplement / Translated title
Parsing von Log Data für Komprimierung und Suche
log data, log parsing, log compression, log template extraction, log mining, log query