# Logswan

Logswan is a fast Web log analyzer using probabilistic data structures. It is
targeted at very large log files, typically API logs. It has constant memory
usage regardless of the log file size, and takes approximately 4MB of RAM.

Unique visitor counting is performed using two HyperLogLog counters (one for
IPv4, and another one for IPv6), providing a relative accuracy of 0.10%.
String representations of IP addresses are used and preferred as they offer
better precision.

Project design goals include: speed, memory-usage efficiency, and keeping the
code as simple as possible.

Logswan is **opinionated software**:

- It only supports the Common Log Format, in order to keep the parsing code
  simple. It can of course process the Combined Log Format as well (referer
  and user agent fields will be discarded)
- It does not split results per day, but log files can be split prior to
  being processed
- Input file size and bandwidth usage are reported in bytes; there are no
  plans to format or round them

Logswan is written with security in mind and runs sandboxed on OpenBSD
(using pledge).
Experimental seccomp support is available for selected
architectures and can be enabled by setting the `ENABLE_SECCOMP` variable
to `1` when invoking CMake. It has also been extensively fuzzed using AFL
and Honggfuzz.

## Features

Currently implemented features:

- Counting used bandwidth
- Counting number of processed lines / invalid lines
- Counting number of hits (IPv4 and IPv6 hits)
- Counting visits (unique IP addresses for both IPv4 and IPv6)
- GeoIP lookups (for both IPv4 and IPv6)
- Hourly hits distribution
- HTTP method distribution
- HTTP protocol distribution
- HTTP status codes distribution

## Dependencies

Logswan uses the `CMake` build system and requires `Jansson` and `libmaxminddb`
libraries and header files.

## Installing dependencies

- OpenBSD: `pkg_add -r cmake jansson libmaxminddb`
- NetBSD: `pkgin in cmake jansson libmaxminddb`
- FreeBSD: `pkg install cmake jansson libmaxminddb`
- macOS: `brew install cmake jansson libmaxminddb`
- Alpine Linux: `apk add cmake gcc make musl-dev jansson-dev libmaxminddb-dev`
- Debian / Ubuntu: `apt-get install build-essential cmake libjansson-dev libmaxminddb-dev`
- Fedora: `dnf install cmake gcc make jansson-devel libmaxminddb-devel`

## Building

    mkdir build
    cd build
    cmake ..
    make

Logswan has been successfully built and tested on OpenBSD, NetBSD, FreeBSD,
macOS, and Linux with both Clang and GCC.

## Packages

Logswan packages are available for:

- [OpenBSD][1]
- [NetBSD][2]
- [FreeBSD][3]
- [Debian][4]
- [Ubuntu][5]
- [Void Linux][6]
- [Homebrew][7]

### GeoIP2 databases

Logswan looks for GeoIP2 databases in `${CMAKE_INSTALL_PREFIX}/share/dbip` by
default, which points to `/usr/local/share/dbip`.

A custom directory can be set using the `GEOIP2DIR` variable when invoking
CMake:

    cmake -DGEOIP2DIR=/var/db/dbip .

The free Creative Commons licensed DB-IP IP to Country Lite database can be
downloaded [here][8].

Alternatively, the GeoLite2 Country database from MaxMind can be downloaded
free of charge [here][9], but it requires accepting an EULA and is not freely
licensed.

## Usage

    logswan [-ghv] [-d db] logfile

If `logfile` is a single dash (`-`), logswan reads from the standard input.

The options are as follows:

    -d db   Specify path to a GeoIP database.
    -g      Enable GeoIP lookups.
    -h      Display usage.
    -v      Display version.

Logswan outputs JSON data to **stdout**.

## License

Logswan is released under the BSD 2-Clause license. See `LICENSE` file for
details.

## Author

Logswan is developed by Frederic Cambus.

- Site: https://www.cambus.net

## Resources

Project homepage: https://www.logswan.org

GitHub: https://github.com/fcambus/logswan

[1]: https://openports.pl/path/www/logswan
[2]: https://pkgsrc.se/www/logswan
[3]: https://www.freshports.org/www/logswan
[4]: https://packages.debian.org/search?keywords=logswan
[5]: https://packages.ubuntu.com/search?keywords=logswan
[6]: https://github.com/void-linux/void-packages/tree/master/srcpkgs/logswan
[7]: https://formulae.brew.sh/formula/logswan
[8]: https://db-ip.com/db/lite.php
[9]: https://dev.maxmind.com/geoip/geoip2/geolite2/
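
To illustrate the unique visitor counting described in the introduction, here
is a minimal HyperLogLog sketch in Python. It is not Logswan's code: the
`HyperLogLog` class, the `count_unique` helper, the SHA-1 hash, and the
register count (`p=14`) are all assumptions chosen for illustration. It only
shows the general technique of feeding IP address strings into two separate
counters whose memory use does not depend on the input size.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog counter with 2**p one-byte registers."""

    def __init__(self, p=14):
        # p should be at least 7 so the alpha constant below applies.
        self.p = p
        self.m = 1 << p
        self.registers = bytearray(self.m)  # fixed size, whatever the input

    def add(self, item):
        # Hash the string form of the item down to a 64-bit integer.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        # The top p bits select a register.
        index = x >> (64 - self.p)
        # The rank is the 1-based position of the leftmost 1-bit
        # in the remaining 64 - p bits.
        rest = x & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


def count_unique(addresses, p=14):
    """Estimate unique IPv4 and IPv6 addresses using one counter per family."""
    v4, v6 = HyperLogLog(p), HyperLogLog(p)
    for addr in addresses:
        # Assumption: any address containing ':' is treated as IPv6.
        (v6 if ":" in addr else v4).add(addr)
    return v4.estimate(), v6.estimate()
```

For `m` registers the standard error of HyperLogLog is roughly
`1.04 / sqrt(m)`, so the `2**14` registers used here give about 0.8%. An
accuracy near 0.10% would correspond to about `2**20` registers, though the
register count Logswan actually uses is not stated in this README.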