<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>[blog.rayfoo] &#187; Splunk</title>
	<atom:link href="http://blog.rayfoo.info/tag/splunk/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.rayfoo.info</link>
	<description>Infosec, DFIR, tech geekery, thoughts and whatnot</description>
	<lastBuildDate>Wed, 25 Jan 2012 00:36:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Profiling of persistent SSHD brute force attack</title>
		<link>http://blog.rayfoo.info/2011/04/profiling-of-persistent-sshd-brute-force-attack</link>
		<comments>http://blog.rayfoo.info/2011/04/profiling-of-persistent-sshd-brute-force-attack#comments</comments>
		<pubDate>Sun, 03 Apr 2011 19:04:19 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[brute forcing]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[log collection]]></category>
		<category><![CDATA[profiling]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[SSH]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=823</guid>
		<description><![CDATA[Proper setting up and regular monitoring of logs gives you the avenue to know what's really happening with your box sitting out there in the internets, and to anticipate when bad things are about to happen.  One of the warning signs would be that someone has been poking around your box, looking for an (easy?) [...]]]></description>
			<content:encoded><![CDATA[<p><strong><img class="alignright size-full wp-image-824" title="Brute Force" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/BruteForce.jpg" alt="" width="300" height="240" /></strong></p>
<p>Setting up logging properly and monitoring it regularly lets you know what's really happening with your box sitting out there in the internets, and anticipate when <em>bad things</em> are about to happen.  One of the warning signs is that <em>someone</em> has been poking around your box, looking for an (easy?) way in.</p>
<p>What would naturally jump out at you then is that this <em>someone</em> has been accessing your box at far higher volumes or for far longer than usual, especially on services that others should not be accessing.</p>
<p>This is one example of such accesses on a linux box: <em>SSHD brute forcing over long periods of time.</em></p>
<p><span id="more-823"></span>Note: This post is more to talk about the process of digging/profiling, rather than the actual setup processes/log sources involved.  Feel free to ping me/comment below if you wish to discuss though.</p>
<p>The first thing you may ask is: what is "persistent"?  This would be the opposite of the run-of-the-mill opportunistic attackers.  These guys tend to bang your machine for a bit, then leave you alone immediately after failing:</p>
<div id="attachment_826" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/01-opportunistic.png"><img class="size-medium wp-image-826" title="Opportunistic" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/01-opportunistic-300x81.png" alt="" width="300" height="81" /></a><p class="wp-caption-text">Opportunistic attack: Tries and gives up.</p></div>
<p>This contrasts greatly with the persistent buggers:</p>
<div id="attachment_827" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/02-persistent.png"><img class="size-medium wp-image-827" title="Persistent" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/02-persistent-300x83.png" alt="Persistent Bugger" width="300" height="83" /></a><p class="wp-caption-text">Whoa!</p></div>
<p>After first digging around on the IP and its supposed country of origin, we want to find out what the attacker tried to do.  One of the logs (*cough*... p0f... *cough*...) records the ports that connections were attempted to, which makes a good starting point:</p>
<p style="text-align: left;">
<div id="attachment_828" class="wp-caption aligncenter" style="width: 508px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/03-ports-accessed.png"><img class="size-full wp-image-828 " title="Ports Accessed" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/03-ports-accessed.png" alt="" width="498" height="272" /></a><p class="wp-caption-text">Mostly port 22 (SSH), only 1 for port 80 (HTTP)?</p></div>
<p style="text-align: left;">Searching for and viewing the port 80 access attempt, by itself and in relation to the other activities shows the following:</p>
<p style="text-align: center;">
<div id="attachment_832" class="wp-caption aligncenter" style="width: 492px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/04-port-80-access.png"><img class="size-full wp-image-832  " title="04-port-80-access" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/04-port-80-access.png" alt="" width="482" height="205" /></a><p class="wp-caption-text">Pinpointing the port 80 connection</p></div>
<p style="text-align: center;">
<div id="attachment_833" class="wp-caption aligncenter" style="width: 491px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/05-confirming-access-profile.png"><img class="size-full wp-image-833  " title="05-confirming-access-profile" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/05-confirming-access-profile.png" alt="" width="481" height="208" /></a><p class="wp-caption-text">Viewing the logs in chronological order (Splunk defaults to reverse chronological)</p></div>
<p style="text-align: left;">In chronological order, the logs show that the port 80 connection preceded the many, many port 22 connections by 2 minutes.  What's going on here?  If <em>somebody</em> wanted to get at the SSH accounts, why not go for them straight, rather than accessing the web service first, and only once?  Checking the web access logs might give the answer we're looking for:</p>
<p style="text-align: center;">
<div id="attachment_834" class="wp-caption aligncenter" style="width: 524px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/06-accessed-http-page.png"><img class="size-full wp-image-834 " title="06-accessed-http-page" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/06-accessed-http-page.png" alt="" width="514" height="157" /></a><p class="wp-caption-text">So in that TCP/80 connection....NOTHING was retrieved</p></div>
<p style="text-align: left;">Accessing <em>nothing </em>in that (only) one connection makes this look like a ping of sorts, but we can't be certain.</p>
<p style="text-align: left;">The next thing is to look at what this <em>somebody</em> was doing over the past two weeks!  First we get an idea of the kinds of things that were happening:</p>
<p style="text-align: center;">
<div id="attachment_835" class="wp-caption aligncenter" style="width: 545px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/07a-sshd-invalid-user.png"><img class="size-full wp-image-835 " title="07a-sshd-invalid-user" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/07a-sshd-invalid-user.png" alt="" width="535" height="301" /></a><p class="wp-caption-text">Mostly &quot;Attempts to login using a non-existent user&quot;, ala our dear Mr Force, Brute Force</p></div>
<div id="attachment_836" class="wp-caption aligncenter" style="width: 523px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/07b-ssh-scan.png"><img class="size-full wp-image-836" title="07b-ssh-scan" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/07b-ssh-scan.png" alt="" width="513" height="205" /></a><p class="wp-caption-text">...and &quot;SSH scan&quot;</p></div>
<p style="text-align: left;">What do these SSH scans mean?</p>
<p style="text-align: center;">
<div id="attachment_837" class="wp-caption aligncenter" style="width: 491px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/08-ssh-scan-no-ident-str-received.png"><img class="size-full wp-image-837  " title="08-ssh-scan-no-ident-str-received" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/08-ssh-scan-no-ident-str-received.png" alt="" width="481" height="217" /></a><p class="wp-caption-text">Just means that the SSH handshake was not properly done/completed.</p></div>
<p style="text-align: left;">We already know that this is a brute force attempt.  Judging by the number of failed SSH handshakes per day, we can assume for now that they result either from connections being blocked, or from "normal" failures in the midst of thousands of attempts.  Zooming into the times when these errors occur would help confirm this, but let's say we're not interested in confirming it for now.</p>
<p style="text-align: left;">Looking at the nature of the attack provides some clues on the tools being used too.  For that we extract some stats concerning the tool's attack:</p>
<p style="text-align: center;">
<div id="attachment_838" class="wp-caption aligncenter" style="width: 548px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/09-targeted-ssh-user-counts.png"><img class="size-full wp-image-838 " title="09-targeted-ssh-user-counts" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/09-targeted-ssh-user-counts.png" alt="" width="538" height="313" /></a><p class="wp-caption-text">Extracting and counting targeted SSH userids shows that 473 userids were attempted, between 1 and 21 times each</p></div>
<p style="text-align: center;">
<div id="attachment_839" class="wp-caption aligncenter" style="width: 491px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/11-distribution-first-occurrences-targeted-users.png"><img class="size-full wp-image-839  " title="11-distribution-first-occurrences-targeted-users" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/11-distribution-first-occurrences-targeted-users.png" alt="" width="481" height="132" /></a><p class="wp-caption-text">First occurrence of each targeted userid is spread out fairly evenly throughout the time period...</p></div>
<div id="attachment_840" class="wp-caption aligncenter" style="width: 491px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/12-distribution-last-occurrences-targeted-users.png"><img class="size-full wp-image-840  " title="12-distribution-last-occurrences-targeted-users" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/12-distribution-last-occurrences-targeted-users.png" alt="" width="481" height="126" /></a><p class="wp-caption-text">...and last occurrences of each userid being fairly even throughout too.</p></div>
<p style="text-align: left;">More stats would be needed depending on the theory you're trying to prove/disprove, but you get the picture.</p>
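<p style="text-align: left;">For anyone without Splunk handy, the same kind of stats (distinct userids, attempts per userid, first and last time each was seen) can be roughed out in Python.  This is only a sketch: the sshd line format, the sample lines and the IP addresses below are assumptions made up for illustration, not the actual logs or searches behind the screenshots.</p>

```python
import re
from collections import Counter

# Assumed sshd log format (not taken from the actual logs in this post).
INVALID_USER = re.compile(
    r"^(\w{3}\s+\d+ [\d:]+) .*sshd\[\d+\]: Invalid user (\S+) from (\S+)"
)

def profile_userids(lines, attacker_ip):
    """Count attempts per userid and note when each was first/last seen."""
    counts = Counter()
    first_seen = {}
    last_seen = {}
    for line in lines:
        m = INVALID_USER.match(line)
        if not m:
            continue
        stamp, userid, ip = m.groups()
        if ip != attacker_ip:
            continue
        counts[userid] += 1
        first_seen.setdefault(userid, stamp)  # keep the earliest timestamp
        last_seen[userid] = stamp             # overwrite with the latest
    return counts, first_seen, last_seen

# Made-up sample lines using documentation-only IP ranges.
SAMPLE = [
    "Apr  1 10:00:01 box sshd[1234]: Invalid user akira from 203.0.113.5",
    "Apr  1 10:00:04 box sshd[1235]: Invalid user wang from 203.0.113.5",
    "Apr  2 11:30:00 box sshd[2201]: Invalid user akira from 203.0.113.5",
    "Apr  2 11:31:00 box sshd[2202]: Invalid user test from 198.51.100.7",
]

counts, first_seen, last_seen = profile_userids(SAMPLE, "203.0.113.5")
print(len(counts), "distinct userids,",
      min(counts.values()), "to", max(counts.values()), "attempts each")
```

<p style="text-align: left;">Bucketing the first_seen and last_seen timestamps by day would give the same sort of distribution as the two timeline charts above.</p>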
<p style="text-align: left;">One of the things I usually would want to see is the list of userids used to brute force.  In this case, it looks like a predominantly Japanese/Chinese wordlist/namelist being used.  Interesting.</p>
<p style="text-align: center;">
<div id="attachment_841" class="wp-caption aligncenter" style="width: 624px"><a href="http://blog.rayfoo.info/wp-content/uploads/2011/04/14-targeted-usernames.png"><img class="size-full wp-image-841 " title="14-targeted-usernames" src="http://blog.rayfoo.info/wp-content/uploads/2011/04/14-targeted-usernames.png" alt="" width="614" height="360" /></a><p class="wp-caption-text">Am I Japanese?  Am I Chinese? <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p></div>
<p>Maybe I should start blogging in other languages to see what kind of brute force wordlists turn up <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
<p>For now, in any case, <span style="color: #ff0000;"><strong>122.166.127.116 (abts-kk-static-116.127.166.122.airtelbroadband.in), I AM WATCHING YOU</strong></span>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2011/04/profiling-of-persistent-sshd-brute-force-attack/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Splunk 4.2</title>
		<link>http://blog.rayfoo.info/2011/03/splunk-4-2</link>
		<comments>http://blog.rayfoo.info/2011/03/splunk-4-2#comments</comments>
		<pubDate>Mon, 21 Mar 2011 11:14:07 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[Splunk]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/2011/03/splunk-4-2</guid>
		<description><![CDATA[The next version of Splunk is out! Amongst the new features that Splunk's advertising, a quick glance through the new version reveals that the revamped management interface might seem to make administering it/clusters easier. Also that the search and reporting features seem to have been beefed up too! More to come after I poke around [...]]]></description>
			<content:encoded><![CDATA[<p>The next version of Splunk is out!</p>
<p>Amongst the new features that Splunk's advertising, a quick glance through the new version suggests that the revamped management interface should make administering Splunk and its clusters easier. The search and reporting features seem to have been beefed up too!</p>
<p>More to come after I poke around some more, and if I have the time to write something <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2011/03/splunk-4-2/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Splunking User Agent strings</title>
		<link>http://blog.rayfoo.info/2010/08/splunking-user-agent-strings</link>
		<comments>http://blog.rayfoo.info/2010/08/splunking-user-agent-strings#comments</comments>
		<pubDate>Sun, 15 Aug 2010 15:05:54 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[EFF]]></category>
		<category><![CDATA[fun]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[jailbreak]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[user agent]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=733</guid>
		<description><![CDATA[Just thought I'd do a quick survey of the kinds of users trying to hit my site, just for the fun of it, heh. Fired up Splunk to do a quick search over the past 7 days: The resulting string can be easily copied and massaged further in a text editor (replacing the "in between" [...]]]></description>
			<content:encoded><![CDATA[<p>Just thought I'd do a quick survey of the kinds of users trying to hit my site, just for the fun of it, heh.</p>
<p>Fired up <a href="http://www.splunk.com/">Splunk</a> to do a quick search over the past 7 days:</p>
<pre class="brush: plain; title: ; notranslate">index=myblogindex | dedup useragent | fields useragent | sort useragent | format</pre>
<p>The resulting string can be easily copied and massaged further in a text editor (by replacing the "in between" strings like <span style="color: #33cccc;"><em>" ) OR ( useragent="</em></span> with <span style="color: #33cccc;"><em>\n</em></span>).</p>
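<p>The text editor step can also be scripted.  Here's a rough Python sketch of it; the exact wrapper that the format command puts around the whole string (the leading and trailing brackets) is an assumption on my part, so adjust the trimming to whatever your output actually looks like.</p>

```python
def split_format_output(formatted, field="useragent"):
    """Turn a Splunk `format` result string into a list of field values."""
    separator = '" ) OR ( ' + field + '="'
    prefix = field + '="'
    # Replace the "in between" strings with newlines.
    text = formatted.replace(separator, "\n")
    # Trim everything up to the first field=" (assumed wrapper shape).
    start = text.find(prefix)
    if start != -1:
        text = text[start + len(prefix):]
    # Trim everything after the last closing quote (assumed wrapper shape).
    end = text.rfind('"')
    if end != -1:
        text = text[:end]
    return text.split("\n")

# Sample built from two UA strings that show up in the logs.
sample = ('( ( useragent="Wget/1.12 (linux-gnu)" ) OR ( '
          'useragent="Mozilla/4.0 (PSP (PlayStation Portable); 2.00)" ) )')

for ua in split_format_output(sample):
    print(ua)
```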
<p>I'm pretty interested still (as always) to see how easy it is to <a href="https://www.eff.org/deeplinks/2010/01/tracking-by-user-agent">profile/"follow" an individual user due to uniqueness of each OS-browser's useragent (UA) strings</a>, but that's another story for another exercise, another day...</p>
<p>Here're some of the more interesting UA strings and analyses.  And these were harvested <em>only</em> over a span of 7 days!</p>
<blockquote><p>BlackBerry9530/5.0.0.732 Profile/MIDP-2.1 Configuration/CLDC-1.1 VendorID/105</p>
<p>SonyEricssonC905/R1FA Browser/NetFront/3.4 Profile/MIDP-2.1 Configuration/CLDC-1.1 JavaPlatform/JP-8.4.3</p>
<p>T-Mobile  Dash Mozilla/4.0 (compatible; MSIE 4.01; Windows CE; Smartphone;  320x240;) MSNBOT-MOBILE/1.1 (+http://search.msn.com/msnbot.htm)</p></blockquote>
<p>Love it when I see mobile browsers' UA strings. I wonder how much further I could dig into them in the future...</p>
<blockquote><p>Flight Deck Bot 1.3 beta (http://www.flightdeckreports.com/bot)</p></blockquote>
<p>Flight Deck's a game that I recently restarted my tactics experiments with. I wonder how exactly they hit my site?  No referrers were sent with the requests, but I suspect they came via Twitter.  Or was it even the same Flight Deck site?  Too lazy to dig further for now <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
<blockquote><p>Mozilla/4.0 (PSP (PlayStation Portable); 2.00)</p></blockquote>
<p>PSP...?</p>
<blockquote><p>Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; sbcydsl 3.12; YComp 5.0.0.0; YPC 3.2.0; FunWebProducts; .NET CLR 1.1.4322; ZangoToolbar 4.8.2; yplus 5.1.04b)</p></blockquote>
<p>Interesting to see how many people have installed adware/spyware like <a href="http://www.google.com/search?q=funwebproducts">FunWebProducts</a>.  There're other examples in my logs too of such malware that modify the UA string, which makes it possible to do detection and statistics in perimeter devices like IDSes...</p>
<blockquote><p>Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_1_2 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Mobile/7D11</p>
<p>Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_1_3 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Mobile/7E18</p>
<p>Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Mobile/8A306</p>
<p>Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_0_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8A306 Safari/6531.22.7</p>
<p>Mozilla/5.0 (iPod; U; CPU iPhone OS 3_1_3 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7E18 Safari/528.16</p>
<p>Mozilla/5.0 (iPod; U; CPU iPhone OS 3_1_3 like Mac OS X; nl-nl) AppleWebKit/528.18 (KHTML, like Gecko) Version/4.0 Mobile/7E18 Safari/528.16</p></blockquote>
<p>iPhones/iPods/iWhatNot.  OS AND browser versions all revealed!  Now, how about some "automatic" "<a href="http://www.symantec.com/connect/blogs/beware-attackers-could-use-new-iphone-4-jailbreak-code-carry-out-malicious-attacks">jailbreaking</a>"? Heh heh heh...not!</p>
<blockquote><p>SAMSUNG-SGH-E250/1.0 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Browser/6.2.3.3.c.1.101 (GUI) MMP/2.0 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)</p></blockquote>
<p>Googlebot using SAMSUNG phones?!  Either Google has some wicked architecture to incorporate mobile phones as crawlers, or this is a very confused bot <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<blockquote><p>Wget/1.12 (linux-gnu)</p>
<p>Wget/1.9+cvs-stable (Red Hat modified)</p>
<p>curl/7.18.2 (i386-pc-win32) libcurl/7.18.2 zlib/1.2.3</p>
<p>curl/7.19.6 (i386-pc-win32) libcurl/7.19.6 OpenSSL/0.9.8k zlib/1.2.3</p></blockquote>
<p>When you see your site being accessed by programs like wget and curl, and it's not Amazon's AWS (use Splunk's lookup dnslookup clientip to find out the clienthost name), it's a fair bet that they're scripted clients, quite possibly zombies/compromised user computers that are part of a botnet.  Residential-looking clienthost names and many different IP addresses would support the zombie theory.</p>
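<p>Picking these command-line clients out of a pile of UA strings is easy to script.  A minimal Python sketch follows; the prefix list is my own assumption (extend it as you see fit), and since a UA string is trivially spoofed, this only narrows down the candidates worth a closer look.</p>

```python
# Assumed list of command-line client prefixes; not exhaustive.
CLI_PREFIXES = ("wget/", "curl/")

def is_cli_client(useragent):
    """Return True if the UA string starts with a known CLI client name."""
    return useragent.strip().lower().startswith(CLI_PREFIXES)

# A few of the UA strings listed in this post.
SEEN = [
    "Wget/1.12 (linux-gnu)",
    "curl/7.18.2 (i386-pc-win32) libcurl/7.18.2 zlib/1.2.3",
    "Mozilla/4.0 (PSP (PlayStation Portable); 2.00)",
]

flagged = [ua for ua in SEEN if is_cli_client(ua)]
for ua in flagged:
    print(ua)
```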
<p>Well, that's all for today folks!  Feel free to comment/discuss below <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/08/splunking-user-agent-strings/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Profiling client internet connections</title>
		<link>http://blog.rayfoo.info/2010/07/profiling-client-internet-connections</link>
		<comments>http://blog.rayfoo.info/2010/07/profiling-client-internet-connections#comments</comments>
		<pubDate>Thu, 08 Jul 2010 10:20:57 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[information gathering]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[p0f]]></category>
		<category><![CDATA[Splunk]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=628</guid>
		<description><![CDATA[Some more fun with p0f and Splunk...Now with profiling of client internet connections! Setup of the p0f and logging is the same as in the OS Profiling post. The Splunk search string has been extended to extract the source's internet link as a field too (go for the portion in bold for the field extracting [...]]]></description>
			<content:encoded><![CDATA[<p>Some more fun with p0f and Splunk...Now with profiling of client internet connections!</p>
<p>Setup of the p0f and logging is the same as in the <a href="http://blog.rayfoo.info/2010/07/os-profiling">OS Profiling</a> post.</p>
<p>The Splunk search string has been extended to extract the source's internet link as a field too (go for the portion in <strong>bold</strong> for the field extracting commands):</p>
<p><span style="color: #339966;">| file /home/path/to/p0f.log | <strong>rex field=_raw "&gt; (?&lt;srcip&gt;[^:]+):(?&lt;srcport&gt;[^ ]+) - (?&lt;srcos&gt;.+?) \(" | rex field=_raw "-&gt; (?&lt;dstip&gt;[^:]+):(?&lt;dstport&gt;[^ ]+) " | rex field=_raw "link: (?&lt;srclink&gt;.*)\)$"</strong> |  regex srclink!="(unspecified|unknown)" | top limit=0 srclink</span></p>
<p>The fields that I extract with this:</p>
<ul>
<li>srcip -&gt; source IP</li>
<li>srcport -&gt; source TCP port</li>
<li>srcos -&gt; source's OS (woot!)</li>
<li>dstip -&gt; destination IP (which is my machine's)</li>
<li>dstport -&gt; the destination port which the TCP connection was initiated to</li>
<li>srclink -&gt; the internet link of the source machine</li>
</ul>
<p>After filtering out the "unspecified" and "unknown" links, the detected links are as follows:</p>
<p style="text-align: center;"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk-connectionlink.png"><img class="size-full wp-image-629 aligncenter" title="p0fsplunk-connectionlink" src="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk-connectionlink.png" alt="" width="600" height="310" /></a></p>
<p style="text-align: left;">"ethernet/modem" mostly points to cable connections.  There're some interesting entries in the list though, like <a href="http://en.wikipedia.org/wiki/VTun">vtun</a>, <a href="http://en.wikipedia.org/wiki/Point-to-Point_Protocol_over_Ethernet">pppoe</a>, Google/AOL, <a href="http://en.wikipedia.org/wiki/IP_tunnel">IPv6</a>/<a href="http://www.linuxfoundation.org/collaborate/workgroups/networking/tunneling">IPIP</a> (early adopters? haha).  I have no idea what IPSec/GRE or vLAN mean in this context though.</p>
<p style="text-align: left;">Just for the heck of it, here's the chart for this table, generated from the reports link in Splunk.</p>
<p style="text-align: left;"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk-connectionchart.png"><img class="aligncenter size-full wp-image-630" title="p0fsplunk-connectionchart" src="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk-connectionchart.png" alt="" width="600" height="377" /></a></p>
<p style="text-align: left;">I like the charts because they allow some interaction for simple datasets, but I digress <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
<p style="text-align: center;"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk-connectionchartmouseover.png"><img class="aligncenter size-full wp-image-631" title="p0fsplunk-connectionchartmouseover" src="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk-connectionchartmouseover.png" alt="" width="600" height="369" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/07/profiling-client-internet-connections/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OS Profiling</title>
		<link>http://blog.rayfoo.info/2010/07/os-profiling</link>
		<comments>http://blog.rayfoo.info/2010/07/os-profiling#comments</comments>
		<pubDate>Tue, 06 Jul 2010 16:00:24 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[information gathering]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[p0f]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[tee]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=605</guid>
		<description><![CDATA[Trying out p0f along with Splunk.. p0f allows you to determine the OS of the remote machine based on the TCP fields characteristics.  It can also tell whether the machine is behind a firewall, what kind of internet connection it is running from...pretty useful for information junkies like me Here's what I did: ./p0f -t [...]]]></description>
			<content:encoded><![CDATA[<p>Trying out <a href="http://lcamtuf.coredump.cx/p0f.shtml">p0f</a> along with <a href="http://www.splunk.com/download">Splunk</a>..</p>
<p>p0f allows you to determine the OS of the remote machine based on the characteristics of its TCP fields.  It can also tell whether the machine is behind a firewall and what kind of internet connection it is running from...pretty useful for information junkies like me <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>Here's what I did:</p>
<p><span style="color: #339966;">./p0f -t -u MyUseridHere -i eth0 'src not MyIPAddressHere' | tee -a p0f.log</span></p>
<p>This runs p0f, logging with actual timestamps (-t), chroot and setuid to MyUseridHere (-u), listening on eth0 (-i), and filtering out packets for connections initiated from my machine itself (since I'm not interested in profiling my own machine).</p>
<p><a href="http://en.wikipedia.org/wiki/Tee_(command)">tee</a> is a (really nifty!) linux command.  It copies its input (stdin) to two places: stdout and the file specified.  The -a option tells it to append to the file instead of overwriting it.</p>
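<p>To make the "two places" idea concrete, here's a tiny Python imitation of the append mode.  It's only a sketch of the behaviour, not how tee is actually implemented; the function name, the temp file and the sample lines are all made up for the demo.</p>

```python
import io
import os
import tempfile

def tee_append(lines, path, out):
    """Copy each input line to `out` and also append it to the file at `path`."""
    with open(path, "a") as log:  # "a" appends instead of overwriting
        for line in lines:
            out.write(line)
            log.write(line)
            log.flush()  # keep the file current while the input keeps coming

# Demo: two separate runs against the same file, like running the pipe twice.
demo_path = os.path.join(tempfile.mkdtemp(), "p0f.log")
screen = io.StringIO()  # stands in for stdout here
tee_append(["first run\n"], demo_path, screen)
tee_append(["second run\n"], demo_path, screen)

with open(demo_path) as f:
    print(f.read(), end="")
```

<p>Both lines end up in the file because the second run appends rather than truncates.  Passing sys.stdout as the last argument would make it print to the terminal instead.</p>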
<p>Using this, p0f outputs logs like this one:</p>
<p><span style="color: #339966;">&lt;Sat Jul  3 07:03:56 2010&gt; 175.40.12.47:1095 - Windows 2000 SP2+, XP SP1+ (seldom 98)<br />
-&gt; 74.207.229.183:80 (distance 12, link: sometimes DSL (2))</span></p>
<p>One of the Splunk queries that I poked around with:</p>
<p><span style="color: #339966;">| file /path/to/p0f.log | rex field=_raw "&gt; (?&lt;srcip&gt;[^:]+):(?&lt;srcport&gt;[^ ]+) - (?&lt;srcos&gt;.+?) \(" | rex field=_raw "-&gt; (?&lt;dstip&gt;[^:]+):(?&lt;dstport&gt;[^ ]+) " | regex srcos!="UNKNOWN" | top limit=0 srcos</span></p>
<p>This query extracts the source and destination IP and port, and the source OS.  Then, after filtering out the entries whose OS is tagged UNKNOWN, the remaining OSes are ranked by count...</p>
<p>The resulting chart is not of much real interest by itself.  It just shows that the connections are predominantly from linux machines (hurhur), and that there's a connection from a really old Netware machine (<a href="http://en.wikipedia.org/wiki/Novell_NetWare#NetWare_5.x">5 was released in Oct 1998!</a>).</p>
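<p>The same extraction can be tried outside Splunk.  Below is a rough Python sketch using the same regular expressions (Python spells named groups with ?P) on the sample p0f record from above.  Treating each two-line record as a single string is my assumption about how the log gets split up, and the function names are made up.</p>

```python
import re
from collections import Counter

# Same patterns as the rex commands in the query, in Python syntax.
SRC = re.compile(r"> (?P<srcip>[^:]+):(?P<srcport>[^ ]+) - (?P<srcos>.+?) \(")
DST = re.compile(r"-> (?P<dstip>[^:]+):(?P<dstport>[^ ]+) ")

def parse_record(record):
    """Extract srcip, srcport, srcos, dstip and dstport from one p0f record."""
    fields = {}
    for pattern in (SRC, DST):
        m = pattern.search(record)
        if m:
            fields.update(m.groupdict())
    return fields

def top_srcos(records):
    """Rank source OSes by count, skipping the ones tagged UNKNOWN."""
    counts = Counter()
    for record in records:
        srcos = parse_record(record).get("srcos")
        if srcos and "UNKNOWN" not in srcos:
            counts[srcos] += 1
    return counts.most_common()

# The sample record shown earlier in this post, as one two-line string.
RECORD = (
    "<Sat Jul  3 07:03:56 2010> 175.40.12.47:1095 - "
    "Windows 2000 SP2+, XP SP1+ (seldom 98)\n"
    "-> 74.207.229.183:80 (distance 12, link: sometimes DSL (2))"
)

print(parse_record(RECORD))
print(top_srcos([RECORD]))
```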
<p style="text-align: center;"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk.png"><img class="aligncenter size-full wp-image-606" title="p0fsplunk" src="http://blog.rayfoo.info/wp-content/uploads/2010/07/p0fsplunk.png" alt="" width="480" height="250" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/07/os-profiling/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Visualizing sshd brute-force attempts (part 2)</title>
		<link>http://blog.rayfoo.info/2010/06/visualizing-sshd-brute-force-attempts-part-2</link>
		<comments>http://blog.rayfoo.info/2010/06/visualizing-sshd-brute-force-attempts-part-2#comments</comments>
		<pubDate>Wed, 02 Jun 2010 16:42:57 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[afterglow]]></category>
		<category><![CDATA[brute forcing]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[graphviz]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[sed]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[SSH]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=581</guid>
		<description><![CDATA[It's always better to Read The Fine Manual (or run perl afterglow.pl -h for the more updated helpfile)...though it's not really that well documented  Afterglow allows for two column inputs, rather than us having to do weird tricks to make them 3-column. (Note to self: get the raw data with fields in the order that [...]]]></description>
			<content:encoded><![CDATA[<p>It's always better to Read The Fine <a href="http://afterglow.sourceforge.net/manual.html#6">Manual</a> (or run <span style="color: #339966;">perl afterglow.pl -h</span> for the more updated helpfile)...though it's not really <em>that</em> well documented <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' />   Afterglow allows for two column inputs, rather than us having to do weird tricks to make them 3-column.</p>
<p>(Note to self: get the raw data with fields in the order that you want where possible/faster, rather than pumping it through <span style="color: #339966;">sed</span>.  Makes for good practice though.)</p>
<p>Using the csv file containing userids (visualized in yellow) and IPs (visualized in green) over the past few months from Splunk, here're the results of some of the experiments.</p>
<p>Oh, for the Windows users, you can use <span style="color: #339966;">type</span> instead of <span style="color: #339966;">cat</span> <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>First test using <a href="http://www.graphviz.org/About.php">GraphViz's</a> neato for layout:</p>
<p style="text-align: center;"><span style="color: #339966;">perl afterglow.pl -b 1 -i &lt;infile&gt; -c color.properties -t | neato -Tgif -o output.gif</span></p>
<div class="wp-caption aligncenter" style="width: 410px"><a href="http://lh4.ggpht.com/_evPUEWAwFrY/TAaB9H39-rI/AAAAAAAAI_E/bjhxhWE5vUc/test-neato.png"><img class="    " title="test afterglow neato" src="http://lh4.ggpht.com/_evPUEWAwFrY/TAaB9H39-rI/AAAAAAAAI_E/bjhxhWE5vUc/s400/test-neato.png" alt="" width="400" height="356" /></a><p class="wp-caption-text">Huge, but better visualized with -e 5 option (Resulting image for that is too huge to upload though <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> ).  Note the single IP in the middle (the yellow explosion) that had been trying a LOT of userids to date.</p></div>
<p>Second test using fdp:</p>
<p style="text-align: center;"><span style="color: #339966;">perl afterglow.pl -b 1 -i &lt;infile&gt; -c color.properties -t | fdp -Tgif -o output.gif</span></p>
<div class="wp-caption aligncenter" style="width: 226px"><a href="http://lh6.ggpht.com/_evPUEWAwFrY/TAaCCQCGs8I/AAAAAAAAI_I/Sogy7NxglyE/test-fdp.png"><img title="test afterglow fdp" src="http://lh6.ggpht.com/_evPUEWAwFrY/TAaCCQCGs8I/AAAAAAAAI_I/Sogy7NxglyE/s400/test-fdp.png" alt="" width="216" height="400" /></a><p class="wp-caption-text">fdp doesn&#39;t seem to be well suited for this</p></div>
<p>Third test using sfdp:</p>
<p>No command here, you should have noticed the pattern from the first two...</p>
<div class="wp-caption aligncenter" style="width: 410px"><a href="http://lh5.ggpht.com/_evPUEWAwFrY/TAaCESgte6I/AAAAAAAAI_M/Z-jVk3Xf3AE/test-sfdp.png"><img title="test afterglow sfdp" src="http://lh5.ggpht.com/_evPUEWAwFrY/TAaCESgte6I/AAAAAAAAI_M/Z-jVk3Xf3AE/s400/test-sfdp.png" alt="" width="400" height="394" /></a><p class="wp-caption-text">_even_ less suited for this type of data...</p></div>
<p>Last test using twopi:</p>
<p>According to the <a href="http://www.graphviz.org/About.php">GraphViz</a> site, twopi's more suited for visualizing stuff like telecommunications flows.</p>
<div class="wp-caption aligncenter" style="width: 386px"><a href="http://lh4.ggpht.com/_evPUEWAwFrY/TAaCFUsQLcI/AAAAAAAAI_Q/9Y9wHwDpzrI/test-twopi.png"><img title="test afterglow twopi" src="http://lh4.ggpht.com/_evPUEWAwFrY/TAaCFUsQLcI/AAAAAAAAI_Q/9Y9wHwDpzrI/s400/test-twopi.png" alt="" width="376" height="400" /></a><p class="wp-caption-text">twopi</p></div>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/06/visualizing-sshd-brute-force-attempts-part-2/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Visualizing sshd brute-force attempts</title>
		<link>http://blog.rayfoo.info/2010/05/visualizing-sshd-brute-force-attempts</link>
		<comments>http://blog.rayfoo.info/2010/05/visualizing-sshd-brute-force-attempts#comments</comments>
		<pubDate>Sun, 30 May 2010 17:25:27 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[data visualization]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[sed]]></category>
		<category><![CDATA[Splunk]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=575</guid>
		<description><![CDATA[Trying out with some interesting results... 1.--- This one is a Splunk query, run over the span of the last 7 days: sourcetype="ossec_alerts" rule_number="5710"&#124; rex field=_raw "Invalid user (?&#60;userid&#62;[^ ]+) from"&#124; fields + src_ip,userid&#124;fields - _*&#124; dedup src_ip userid&#124; outputcsv ssh-atk-attempts-userid-ip 2.--- Then some data massaging on the csv file... [edit: this is not needed...just [...]]]></description>
			<content:encoded><![CDATA[<p>Trying out with some interesting results...</p>
<h2>1.---</h2>
<p>This one is a Splunk query, run over the span of the last 7 days:</p>
<pre>sourcetype="ossec_alerts" rule_number="5710"|</pre>
<pre>rex field=_raw "Invalid user (?&lt;userid&gt;[^ ]+) from"|</pre>
<pre>fields + src_ip,userid|fields - _*|</pre>
<pre>dedup src_ip userid|</pre>
<pre>outputcsv ssh-atk-attempts-userid-ip</pre>
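<p>For anyone without Splunk at hand, here is a rough Python sketch of what the rex and dedup steps above are doing.  It is only an illustration: the sample log lines are made up, and it pulls the source IP out of the raw line with a second capture group, whereas the Splunk query uses a src_ip field that is already extracted.</p>

```python
import re

# Same idea as the rex above: the userid is whatever sits between
# "Invalid user " and " from". Capturing the IP here is an assumption
# made for this sketch; Splunk already has it as the src_ip field.
INVALID_USER = re.compile(r"Invalid user ([^ ]+) from (\S+)")

# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    "May 30 10:00:01 box sshd[101]: Invalid user admin from 192.0.2.10",
    "May 30 10:00:05 box sshd[102]: Invalid user admin from 192.0.2.10",
    "May 30 10:00:09 box sshd[103]: Invalid user guest from 192.0.2.10",
    "May 30 10:01:00 box sshd[104]: Invalid user admin from 198.51.100.7",
]


def unique_ip_userid_pairs(lines):
    """Return (src_ip, userid) pairs in first-seen order, without duplicates."""
    pairs = []
    for line in lines:
        match = INVALID_USER.search(line)
        if match:
            userid, src_ip = match.groups()
            pairs.append((src_ip, userid))
    # dict.fromkeys keeps insertion order, so this behaves roughly like dedup.
    return list(dict.fromkeys(pairs))


# Print the pairs as two-column CSV rows.
for src_ip, userid in unique_ip_userid_pairs(SAMPLE_LINES):
    print(f"{src_ip},{userid}")
```

<p>The printed rows have the same two-column shape as the csv file that the Splunk query writes out.</p>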
<h2>2.---</h2>
<p>Then some data massaging on the csv file...</p>
<p>[edit: this is not needed...just output the csv file with the fields in the order you want...and read the next <a href="http://blog.rayfoo.info/2010/06/visualizing-sshd-brute-force-attempts-part-2">post</a> for better options with 2-column csv inputs]</p>
<pre>cat ssh-atk-attempts-userid-ip.csv | \</pre>
<pre>sed 's/^.*$/&amp;,server/' &gt; ssh-atk-attempts-userid-ip2.csv</pre>
<h2>3.---</h2>
<p>Then running it through Afterglow and GraphViz's neato...</p>
<pre>cat ssh-atk-attempts-userid-ip2.csv | \</pre>
<pre>./afterglow.pl | neato -Tgif -o ssh-atk-ip-userid.gif</pre>
<p><a href="http://blog.rayfoo.info/wp-content/uploads/2010/05/ssh-atk-ip-userid.gif"><img class="aligncenter size-medium wp-image-577" title="ssh-atk-ip-userid" src="http://blog.rayfoo.info/wp-content/uploads/2010/05/ssh-atk-ip-userid-300x284.gif" alt="" width="300" height="284" /></a></p>
<p>Seems like there's very little overlap in the userids that were attempted (with the exception of a few favourites like root, guest, test).  A coordinated/distributed attack perhaps?  Haven't dug more into the IPs in question, but I'm pretty sure that they'd be broadband addresses, which would suggest that they are bots.</p>
<p>Of course we could try a larger timespan, but the result isn't really readable... The resulting 1MB file (1813 x 1704 px) covering <em>all time</em> in Splunk looks pretty, but that's about it.</p>
<p>[edit: there're better results in the next <a href="http://blog.rayfoo.info/2010/06/visualizing-sshd-brute-force-attempts-part-2">post</a>!]</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/05/visualizing-sshd-brute-force-attempts/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting additional (IP/network/location) info along with your Splunk searches</title>
		<link>http://blog.rayfoo.info/2010/04/getting-additional-ipnetworklocation-info-along-with-your-splunk-searches</link>
		<comments>http://blog.rayfoo.info/2010/04/getting-additional-ipnetworklocation-info-along-with-your-splunk-searches#comments</comments>
		<pubDate>Mon, 19 Apr 2010 17:57:07 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[commands]]></category>
		<category><![CDATA[geolocation]]></category>
		<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[network]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=529</guid>
		<description><![CDATA[Chanced upon some of the info by accident (smack at the bottom of one part of the Splunk documentation...), but I can't find it now.  Going to share here anyway Some (or probably most/all) of your searches might involve public IP addresses, and more often than not we would want to have additional info along [...]]]></description>
			<content:encoded><![CDATA[<p>Chanced upon some of the info by accident (smack at the bottom of one part of the <a href="http://www.splunk.com/">Splunk</a> <a href="http://www.splunk.com/base/Documentation">documentation</a>...), but I can't find it now.  Going to share here anyway <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
<p>Some (or probably most/all) of your searches might involve public IP addresses, and more often than not we would want to have additional info along with the IP address to work with.</p>
<p>Three of the things that we can do automatically in Splunk are getting IP-location info, reverse-resolving an IP to a hostname, and resolving a hostname to an IP.</p>
<p><span id="more-529"></span></p>
<h1>1. Geolocation</h1>
<p>There are two ways to geolocate IPs: using the iplocation command, or using the MAXMIND app.</p>
<h2>1a. iplocation</h2>
<p>The command iplocation is described as:</p>
<blockquote><p>Finds ips in _raw and looks up the IP location using the hostip.info database. IPs are extracted as ip1, ip2, etc. Cities and Countries are likewise extracted.</p></blockquote>
<p>All we need to do is pipe the search to iplocation and let it do the rest!  The lookups are done from the server on the fly, so make sure that the server is able to do whois/ns lookups on the network.</p>
<p style="text-align: center;"><span style="color: #339966;">index=myindex | iplocation</span></p>
<p><a href="http://blog.rayfoo.info/wp-content/uploads/2010/04/splunk-iplocation.png"><img class="aligncenter size-medium wp-image-530" title="splunk iplocation" src="http://blog.rayfoo.info/wp-content/uploads/2010/04/splunk-iplocation-300x138.png" alt="" width="300" height="138" /></a></p>
<h2>1b. MAXMIND app</h2>
<p>As mentioned before: install the <a href="http://www.splunkbase.com/apps/All/4.x/Add-On/app:Geo+Location+Lookup+Script">MAXMIND app</a>, then pipe the field containing IPs to the lookup.  The lookup expects a field named clientip; if your field is named something else, map it with "as" like in the second example, otherwise this will not work.</p>
<p>This can work without the server having any internet connectivity, but the accuracy is entirely dependent on the cached MAXMIND database.</p>
<p style="text-align: center;"><span style="color: #339966;">index=myindex | lookup geoip clientip</span></p>
<p style="text-align: center;">or</p>
<p style="text-align: center;"><span style="color: #339966;">index=myindex2 | lookup geoip clientip as fieldwithip</span></p>
<p><a href="http://blog.rayfoo.info/wp-content/uploads/2010/04/splunk-geoiplookup.png"><img class="aligncenter size-medium wp-image-531" title="splunk geoiplookup" src="http://blog.rayfoo.info/wp-content/uploads/2010/04/splunk-geoiplookup-300x137.png" alt="" width="300" height="137" /></a></p>
<h2>2, 3. IP-hostname or hostname-IP</h2>
<p>These two items are pretty similar.  Splunk 4 comes with a lookup script called external_lookup.py, and the config is already in the default transforms.conf.  So we only need to use it!</p>
<p style="text-align: center;">Resolving IPs to hostnames:</p>
<p style="text-align: center;"><span style="color: #339966;">index=myindex | lookup dnslookup clientip</span></p>
<p><a href="http://blog.rayfoo.info/wp-content/uploads/2010/04/splunk-ip-to-hostname.png"><img class="aligncenter size-medium wp-image-532" title="splunk ip to hostname" src="http://blog.rayfoo.info/wp-content/uploads/2010/04/splunk-ip-to-hostname-300x136.png" alt="" width="300" height="136" /></a></p>
<p style="text-align: center;">Resolving hostnames to IPs:</p>
<p style="text-align: center;"><span style="color: #339966;">index=myindex | lookup dnslookup clienthost</span></p>
<p style="text-align: center;">(no screenshot, sorry <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> )</p>
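<p>To get a feel for what such a lookup does under the hood, here is a minimal Python sketch using the standard socket module.  This is not the code of external_lookup.py, just an illustration of forward and reverse resolution; the function names and the optional resolver parameter are my own, added so that the functions can be tried without touching the network.</p>

```python
import socket


def ip_to_hostname(ip, resolver=socket.gethostbyaddr):
    """Reverse lookup: return the primary hostname, or None if it fails."""
    try:
        # gethostbyaddr returns (hostname, aliases, ip_addresses).
        return resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def hostname_to_ip(host, resolver=socket.gethostbyname):
    """Forward lookup: return one IPv4 address as a string, or None if it fails."""
    try:
        return resolver(host)
    except OSError:
        return None


# Stand-in resolver for a dry run; 192.0.2.10 and host.example are
# documentation placeholders, not real lookup results.
def fake_reverse(ip):
    return ("host.example", [], [ip])


print(ip_to_hostname("192.0.2.10", resolver=fake_reverse))
```

<p>Drop the resolver argument to do real lookups; like the Splunk lookup, that only works if the machine can reach a DNS server.</p>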
<p>Leave a comment if this helped, or if you want to ask anything!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/04/getting-additional-ipnetworklocation-info-along-with-your-splunk-searches/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fun with Splunk: SSHD</title>
		<link>http://blog.rayfoo.info/2010/03/fun-with-splunk-sshd</link>
		<comments>http://blog.rayfoo.info/2010/03/fun-with-splunk-sshd#comments</comments>
		<pubDate>Sat, 13 Mar 2010 11:02:20 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[brute forcing]]></category>
		<category><![CDATA[geolocation]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[MaxMind]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[SSH]]></category>
		<category><![CDATA[tutorials]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=489</guid>
		<description><![CDATA[Thought I'd share a bit on the tip of the iceberg, on what can be done with Splunk.  Linux command line tools are still much needed for raw log analysis (since we can't have the luxury of having a Splunk installation around and ready whenever we need it), but if setup and running properly, Splunk [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-full wp-image-498" title="splunk search" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-search1.png" alt="" width="164" height="38" />Thought I'd share just the tip of the iceberg of what can be done with Splunk.  Linux command line tools are still much needed for raw log analysis (we don't always have the luxury of a Splunk installation around and ready whenever we need it), but if set up and running properly, Splunk can be pretty helpful (not to mention faster) for some things.</p>
<p>(This post is pretty unpolished, partly because I can't be bothered to fiddle around with fitting the search strings into the width of the post, etc.  Nonetheless,  comments/discussions are always welcome heh)</p>
<p>One of my favourite tasks with log analysis is to get information on those people/bots which are brute forcing SSHD, so let's start with SSH attacks as an example.<span id="more-489"></span></p>
<h2>Prerequisites</h2>
<p>Before we start off, we'll need Splunk set up to monitor the appropriate logfiles.  I've configured and run the <a href="http://www.splunkbase.com/apps/All/4.x/app:Splunk+for+OSSEC+(Splunk+v4+version)">OSSEC</a> and <a href="http://www.splunkbase.com/apps/All/4.x/app:Splunk+for+Unix+and+Linux">Linux</a> apps for Splunk, so that the data inputs are taken care of for me.  If you don't want to run these apps, just make sure you index /var/log and the OSSEC alert log locations.  If you want to do the geolocation stuff, then the <a href="http://www.splunkbase.com/apps/All/4.x/Add-On/app:Geo+Location+Lookup+Script">MaxMind</a> app for Splunk is needed too.</p>
<h2>List of SSH attacks</h2>
<p>Let's start off with a simple query to see the list of previous SSH attacks:</p>
<pre style="text-align: center;"><span style="color: #00ff00;">source=*auth* sshd invalid user from</span></pre>
<p>Using this search string with the needed time range set shows a pretty graph of how many attacks we've got over time, along with the list of log entries for the attack.</p>
<div id="attachment_499" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-1.png"><img class="size-medium wp-image-499" title="splunk listing of sshd attacks" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-1-300x149.png" alt="" width="300" height="149" /></a><p class="wp-caption-text">Click to enlarge</p></div>
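<p>Without Splunk, the same attacks-over-time count can be approximated straight from the auth log.  A minimal Python sketch, assuming classic syslog lines that start with a "Mon DD" timestamp (the sample lines are made up):</p>

```python
from collections import Counter

# Made-up auth log lines, for illustration only.
SAMPLE_LINES = [
    "Feb 21 23:59:58 box sshd[90]: Invalid user oracle from 192.0.2.10",
    "Feb 22 00:00:03 box sshd[91]: Invalid user admin from 192.0.2.10",
    "Feb 22 00:00:07 box sshd[92]: Invalid user test from 198.51.100.7",
    "Feb 22 08:15:00 box sshd[93]: Accepted password for ray from 203.0.113.5 port 22 ssh2",
]


def attacks_per_day(lines):
    """Count sshd 'invalid user' lines per (month, day) of the syslog timestamp."""
    per_day = Counter()
    for line in lines:
        lowered = line.lower()
        # Rough equivalent of the keyword search: case-insensitive, all terms required.
        if "sshd" in lowered and "invalid user" in lowered:
            month, day = line.split()[:2]
            per_day[(month, day)] += 1
    return per_day


for (month, day), count in attacks_per_day(SAMPLE_LINES).items():
    print(month, day, count)
```

<p>Splunk does the bucketing and graphing for us, which is exactly why it's nicer for this kind of thing.</p>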
<p>Seems that the attacks each day are few, probably due to OSSEC's active responses.  A quick search confirms that OSSEC is blocking the offending hosts.</p>
<pre style="text-align: center;"><span style="color: #00ff00;">sourcetype="ossec_alerts" </span></pre>
<pre style="text-align: center;"><span style="color: #00ff00;">action="SSHD brute force trying to get access to the system."</span></pre>
<div id="attachment_507" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-2.png"><img class="size-medium wp-image-507" title="splunk ossec active responses" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-2-300x146.png" alt="" width="300" height="146" /></a><p class="wp-caption-text">Click to enlarge</p></div>
<h2>Drilling Down</h2>
<p>Now we know that the attacks were especially active on 22nd Feb, and that OSSEC was responding correctly by blocking them off.  Why the large numbers then?  Was it because the attacks came from many different IP addresses, or because a few IP addresses were particularly persistent that day?  We can find out by getting more information on the src_ips for the time range in question.  First we click on the bar for 22nd Feb, then on the src_ip field in the sidebar.</p>
<div id="attachment_509" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-31.png"><img class="size-medium wp-image-509" title="splunk ssh ossec src ips" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-31-300x187.png" alt="" width="300" height="187" /></a><p class="wp-caption-text">Click to enlarge</p></div>
<p>With the time range fixed on what we're interested in, and the src_ip field showing the unique source IPs that were blocked, the results show that it was most likely a persistent attack by these two IPs.  A quick check of the auth logs tells the same story:</p>
<div id="attachment_510" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-4.png"><img class="size-medium wp-image-510" title="splunk sshd brute force src ips" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-4-300x236.png" alt="" width="300" height="236" /></a><p class="wp-caption-text">Click to enlarge</p></div>
<h2>GeoIP Lookups</h2>
<p>Now that we know which two IPs were actively poking around, let's map them to a location.  The MaxMind app for Splunk helps nicely for this task.</p>
<pre style="text-align: center;"><span style="color: #00ff00;">source=*auth* sshd invalid user from | </span></pre>
<pre style="text-align: center;"><span style="color: #00ff00;">lookup geoip clientip as src_ip</span></pre>
<div id="attachment_511" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-5.png"><img class="size-medium wp-image-511" title="splunk srcip geoiplookup" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-5-300x210.png" alt="" width="300" height="210" /></a><p class="wp-caption-text">Click to enlarge</p></div>
<p>The app and the local geoip database do the lookups for us nicely, mapping each IP to geolocation information like country, city, latitude, longitude and region.  Country information seems to be available for most (if not all) IPs; the other fields are filled in only when available.</p>
<h2>List/Count of attacked userids for SSH</h2>
<p>The search strings for this depend on your SSHD config, but for me searching for the invalid users is enough.</p>
<pre style="text-align: center;"><span style="color: #00ff00;">source=*auth* sshd invalid user from | </span></pre>
<pre style="text-align: center;"><span style="color: #00ff00;">rex field=_raw "Invalid user (?&lt;atk_user_id&gt;\S+) from "</span></pre>
<p>Searching/sorting by the atk_user_id field would show us the attacked userids.  Click on the "Events Table" button to show the table of results with only the fields that you've selected.</p>
<div id="attachment_512" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-6.png"><img class="size-medium wp-image-512" title="splunk searching for attacked sshd userids" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-6-300x153.png" alt="" width="300" height="153" /></a><p class="wp-caption-text">Click to enlarge</p></div>
<p>If we want a sorted list of the top attacked userids, pipe the search string to a top command.</p>
<pre style="text-align: center;"><span style="color: #00ff00;">source=*auth* sshd invalid user from | rex field=_raw </span></pre>
<pre style="text-align: center;"><span style="color: #00ff00;">"Invalid user (?&lt;atk_user_id&gt;\S+) from "</span></pre>
<pre style="text-align: center;"><span style="color: #00ff00;"> | top atk_user_id limit=1000</span></pre>
<p>The Results Table should show automatically for this search.</p>
<div id="attachment_513" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-7.png"><img class="size-medium wp-image-513" title="slunk sshd userids brute forced" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-7-300x241.png" alt="" width="300" height="241" /></a><p class="wp-caption-text">Click to enlarge</p></div>
<p>Maybe we'd like an alphabetical list instead, so we just pipe the search to a sort command:</p>
<pre style="text-align: center;"><span style="color: #00ff00;">source=*auth* sshd invalid user from | rex field=_raw </span></pre>
<pre style="text-align: center;"><span style="color: #00ff00;">"Invalid user (?&lt;atk_user_id&gt;\S+) from "</span></pre>
<pre style="text-align: center;"><span style="color: #00ff00;"> | top atk_user_id limit=1000 | sort atk_user_id</span></pre>
<div id="attachment_518" class="wp-caption aligncenter" style="width: 310px"><a href="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-8.png"><img class="size-medium wp-image-518" title="splunk sshd userids alphabetical sort" src="http://blog.rayfoo.info/wp-content/uploads/2010/03/splunk-sshd-8-300x221.png" alt="" width="300" height="221" /></a><p class="wp-caption-text">Click to enlarge</p></div>
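<p>The top and sort steps can be sketched in plain Python too, which is handy when there's only a raw log file around.  This is just an approximation of what Splunk's top gives (count and percentage per value); the sample lines and the function name are made up for illustration.</p>

```python
import re
from collections import Counter

# Same pattern as the rex above, with a plain group instead of a named one.
INVALID_USER = re.compile(r"Invalid user (\S+) from ")

# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    "Feb 22 10:00:01 box sshd[101]: Invalid user root from 192.0.2.10",
    "Feb 22 10:00:05 box sshd[102]: Invalid user admin from 192.0.2.10",
    "Feb 22 10:00:09 box sshd[103]: Invalid user root from 198.51.100.7",
    "Feb 22 10:00:12 box sshd[104]: Invalid user guest from 198.51.100.7",
    "Feb 22 10:00:15 box sshd[105]: Invalid user root from 192.0.2.10",
]


def top_userids(lines, limit=1000):
    """Return (userid, count, percent) rows, most frequent first."""
    counts = Counter()
    for line in lines:
        match = INVALID_USER.search(line)
        if match:
            counts[match.group(1)] += 1
    total = sum(counts.values())
    return [(user, n, 100.0 * n / total) for user, n in counts.most_common(limit)]


rows = top_userids(SAMPLE_LINES)
for user, n, percent in rows:
    print(user, n, round(percent, 1))

# The alphabetical variant: same rows, re-sorted by userid.
alphabetical = sorted(rows, key=lambda row: row[0])
```

<p>Here rows is ordered by count like the top output, and alphabetical is the same data ordered by userid like the sort output.</p>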
<p>Alright, that's all for now <img src='http://blog.rayfoo.info/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/03/fun-with-splunk-sshd/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Troubleshooting Splunk</title>
		<link>http://blog.rayfoo.info/2010/03/troubleshooting-splunk</link>
		<comments>http://blog.rayfoo.info/2010/03/troubleshooting-splunk#comments</comments>
		<pubDate>Mon, 08 Mar 2010 14:27:51 +0000</pubDate>
		<dc:creator>ray</dc:creator>
				<category><![CDATA[Everything]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[log analysis]]></category>
		<category><![CDATA[log collection]]></category>
		<category><![CDATA[logs]]></category>
		<category><![CDATA[Splunk]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[troubleshooting]]></category>

		<guid isPermaLink="false">http://blog.rayfoo.info/?p=478</guid>
		<description><![CDATA[Have been fiddling around with Splunk lately.  Splunk's a really good tool to use for log collection and analysis (and that's oversimplifying it, I believe it can even do event correlation...), which really made my love for data mining go crazy of late:P  Best part is that it has a perpetual free license, nice! One [...]]]></description>
			<content:encoded><![CDATA[<p>Have been fiddling around with <a href="http://www.splunk.com/">Splunk</a> lately.  Splunk's a really good tool to use for log collection and analysis (and that's oversimplifying it, I believe it can even do event correlation...), which has really made my love for data mining go crazy of late :P  Best part is that it has a perpetual free license, nice!</p>
<p>One of the things I encountered when using Splunk was that it didn't seem to be indexing all the log files that it was set to monitor.  After some reading up and experimenting the reason became clear: Splunk will not work properly if you set it to monitor too many files.</p>
<p>How many is too many?  For example, a logfile directory with only one active log and 100+ rotated logs is too many.  What should be done instead is to set Splunk to monitor the active logfile only, and add the other logfiles to the index you want as oneshot inputs.</p>
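<p>As a rough sketch of that split, the Python snippet below takes a directory listing and prints one "add monitor" command for the active log and one "add oneshot" command for each rotated log.  The file names and index name are hypothetical, and the command lines follow the Splunk CLI syntax as I remember it, so check them against the documentation for your version before running anything.</p>

```python
def plan_inputs(filenames, active_name, index):
    """Build one CLI command per file: monitor the active log, oneshot the rest."""
    commands = []
    for name in sorted(filenames):
        action = "monitor" if name == active_name else "oneshot"
        commands.append(["splunk", "add", action, name, "-index", index])
    return commands


# Hypothetical directory contents, for illustration only.
FILES = ["/var/log/auth.log", "/var/log/auth.log.1", "/var/log/auth.log.2.gz"]

# Only print the commands; nothing is executed here.
for command in plan_inputs(FILES, "/var/log/auth.log", "main"):
    print(" ".join(command))
```

<p>Printing the commands first makes it easy to eyeball the list before feeding the old logs in.</p>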
<p>Gonna do some more sharing/writeups about this crazily great tool.  There's really a lot that this thing can do man.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.rayfoo.info/2010/03/troubleshooting-splunk/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

