Splunk Boss of the SOC v2 (650 pts)


What You Need for this Project


To practice threat hunting, using the Boss of the SOC (BOTS) v2 Dataset.

Connecting to My Splunk Server

Go here: https://splunk3.samsclass.info or here: https://splunk4.samsclass.info

To get past the Basic authentication, log in as student1 with a password of student1

Once you see the Splunk page, log in again as student1 with a password of student1

The "Search" page opens. Enter this search string:

Just below the Search box, on the left side, click Sampling and click 1:1,000.

On the right side, adjust the time range to "All time". Click the green magnifying-glass icon to run the search.

Splunk finds approximately 71,700 events, as shown below.


BOTS2 100: Amber Turing was hoping for Frothly (her beer company) to be acquired by a potential competitor which fell through, but visited their website to find contact information for their executive team. What is the website domain that she visited?

Answer example: google.com (10 pts)

Hints: Search for amber using the correct index value and 1:100 sampling. There are around 440 events found. They all have the same client_ip address--that's Amber's IP address.

Search all events in the correct index with 1:10,000 sampling. There are around 6,898 events. Examine the sourcetype field and restrict it to stream:http. There are around 12 events. Switch to 1:100 sampling.

Examine the src_ip field and restrict it to Amber's IP address. Switch to "No Event Sampling". There are 1,863 events.

Look at the site and http_referer values and look for names of rival beer makers.

BOTS2 101: Amber found the executive contact information and sent him an email. What is the CEO's name? Provide the first and last name. (5 pts)
Hints: Look for emails (SMTP traffic) to or from Amber Turing. Find her email address.

Find emails containing Amber's email address and the domain name of the competitor you found in question 100. There are only 4 of them. Read them to find the answer.

BOTS2 102: After the initial contact with the CEO, Amber contacted another employee at this competitor. What is that employee's email address? (5 pts)

BOTS2 103: What is the name of the file attachment that Amber sent to a contact at the competitor? (5 pts)

BOTS2 104: What is Amber's personal email address? (15 pts)
Hints: Look for emails sent from Amber Turing.

Review the body of emails that Amber has sent.

Review the email with base64-encoded text for body (or content) and decode the base64.
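The decoding step can also be done outside Splunk. Here is a minimal Python sketch, assuming you have copied the encoded body out of the event as text. The function name and the sample value are placeholders for illustration, not data from the BOTS dataset.

```python
import base64


def decode_body(encoded: str) -> str:
    """Decode a base64 email body, tolerating line wraps and trimmed padding."""
    compact = "".join(encoded.split())      # MIME bodies are often wrapped across lines
    compact += "=" * (-len(compact) % 4)    # restore padding if it was cut off
    return base64.b64decode(compact).decode("utf-8", errors="replace")


# Placeholder input for illustration only
print(decode_body("aGVs\nbG8"))
```

If the decoded text looks like gibberish, the body may use a different character set or be only partly base64; try decoding just the encoded portion.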

BOTS2 105: What version of TOR did Amber install to obfuscate her web browsing? Answer guidance: three numbers with dots between them, like 1.2.3 (10 pts)
Hint: Find the typical domain used to download Tor. Look for events caused by installing software.

BOTS2 200: What is the public IPv4 address of the server running www.brewertalk.com? (10 pts)
Hints: Search for events containing that FQDN, recording DNS resolutions. Find queries to a known public DNS server.

BOTS2 201: Provide the IP address of the system used to run a web vulnerability scan against www.brewertalk.com. (10 pts)
Hint: App scanners are often 'noisy', sending many invalid requests. Search for events with the target URL in them and an action of "blocked".

BOTS2 202: The IP address from question 201 is also being used by a likely different piece of software to attack a URI path. What is the URI path? Answer example: phpinfo.php (10 pts)
Hints: Analyze all HTTP traffic from the scanning system to www.brewertalk.com, and inspect URI values.

BOTS2 203: What SQL function is being abused on the uri path from question 202? (10 pts)
Hints: Find all events with sourcetype="stream:http" using the POST method from the attacker's IP address you found in question 201. There are 9,708 such events.

Restrict the uri_path to the value you found in question 202. There are 1,188 such events.

Examine the 590 form_data values for suspicious entries. This is not hard to do: view them 100 on a page and scan them quickly. Find the function the attacker is using to change data in the database.

BOTS2 204: What is Frank Ester's password salt value on www.brewertalk.com? (20 pts)
Hints: Start with the 1,188 events you used in the previous question. Restrict these to the events using the SQL function you found in that question. There are 136 such events.

Look in the dest_content field of these events for error messages. Narrow down the events to only those that include the suspected SQL injection traffic. Stream HTTP events contain the details you need. Filter on the source IP, dest IP, HTTP user agent, and URI path.

These events will probably make a lot more sense if you reverse the Splunk event ordering by piping your search results to the reverse command. This will show you the first SQL injection commands at the top of the list and later events below.

There is a lot of data captured in these events. You are looking for two pieces of data in the dest_content field. The first can be found following the string 'XPATH syntax error: '

The other important piece of data in the dest_content field can be extracted with the following regular expression: '\s+(?<sqli_query>[^<]+)'

Look for the sqli_query values that are stealing salt values.

BOTS2 205: What is user btun's password on brewertalk.com? (20 pts)
Hints: His hashed password and salt were stolen via SQLi and captured in Splunk. Also note a 'top 1000' password list is available in a Splunk lookup table file called 'top_1000.csv'. Use '| inputlookup top_1000.csv' to inspect it.

By inspecting the code for this forum software, it can be determined that the stored password hash is computed as follows: md5( md5(salt) + md5(plaintext password) ) where '+' is simple string concatenation.

The Splunk eval command includes an md5 hash function. Beware that the exploit used in this attack chops the final character from the password hash and includes it as a single character string in the next SQLi extraction. When you use this string, either add the character back to the end of the hash, or just use a wildcard match on the beginning of it.

Btun's salt value is 'tlX7cQPE' and his complete password hash is 'f91904c1dd2723d5911eeba409cc0d14'
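The hash formula in the hints can be tried outside Splunk as well. Below is a minimal Python sketch, assuming each md5() returns a lowercase hex string and that the hex strings are what get concatenated. The function names, the demo salt, and the short candidate list are made up for illustration; in practice the candidates would come from top_1000.csv.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Lowercase hex MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def forum_hash(salt: str, password: str) -> str:
    """md5( md5(salt) + md5(password) ), where '+' joins the two hex strings."""
    return md5_hex(md5_hex(salt) + md5_hex(password))


def crack(salt: str, target_prefix: str, candidates):
    """Return the first candidate whose hash starts with target_prefix, else None."""
    wanted = target_prefix.lower()
    for candidate in candidates:
        if forum_hash(salt, candidate).startswith(wanted):
            return candidate
    return None


# Hypothetical stand-ins: a demo salt and a tiny candidate list
candidates = ["123456", "password", "letmein"]
demo_salt = "demoSalt"
demo_hash = forum_hash(demo_salt, "letmein")

# Matching on a prefix copes with a hash whose final character was chopped off
print(crack(demo_salt, demo_hash[:-1], candidates))
```

The prefix match plays the same role as the wildcard match suggested above, so the function works with either the complete hash or the truncated one.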

BOTS2 207: What was the value of the cookie that Kevin's browser transmitted to the malicious URL as part of a XSS attack? Answer guidance: All digits. Not the cookie name or symbols like an equal sign. (10 pts)
Hints: Check out sourcetype=stream:http

Inspect the uri_query field.

BOTS2 208: The brewertalk.com web site employed Cross Site Request Forgery (CSRF) techniques. What was the value of the anti-CSRF token that was stolen from Kevin Lagerfield's computer and used to help create an unauthorized admin user on brewertalk.com? (20 pts)
Hints: Anti-CSRF tokens are usually hidden form elements set when the browser loads an HTML page containing a form. If the form is submitted without the anti-CSRF token, the backend code of the website rejects the transaction as it might have come from a malicious source rather than from a legitimate user of the form.

One of the many ways that an attacker can abuse a cross site scripting vulnerability is to use it to defeat CSRF protections. If you carefully inspect XSS attacks in the data set, you will stumble on some malicious code that is stealing the anti-CSRF token.

On brewertalk.com, users created with usergroup=4 are administrators.

The name of the anti-CSRF token is my_post_key

BOTS2 209: What brewertalk.com username was maliciously created by a spearphishing attack? (15 pts)
Hints: The attacker was trying to masquerade as something that would look legitimate to a casual observer.

The attacker stole a trick from domain squatters by using a homograph attack. More info on homograph attacks can be found on Wikipedia.

The password of this new, unauthorized, malicious administrative account is beer_lulz
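One way to spot a lookalike username is to fold easily confused characters together and compare the result against known account names. The Python sketch below assumes a small hand-picked confusables map, which is far from complete, and uses made-up usernames rather than names from the dataset.

```python
# Hand-picked confusable characters (an illustrative assumption, not a full list)
CONFUSABLES = str.maketrans({
    "I": "l",        # capital i looks like lowercase L in many fonts
    "1": "l",
    "0": "o",
    "O": "o",
    "\u0430": "a",   # Cyrillic small a
    "\u0435": "e",   # Cyrillic small ie
})


def skeleton(name: str) -> str:
    """Fold confusable characters together, then lowercase."""
    return name.translate(CONFUSABLES).lower()


def lookalikes(candidate: str, known_names):
    """Known names that differ from candidate but look the same once folded."""
    return [k for k in known_names
            if k != candidate and skeleton(k) == skeleton(candidate)]


# Hypothetical usernames for illustration
known = ["alice", "bob"]
print(lookalikes("aIice", known))   # second character is a capital i
print(lookalikes("b0b", known))     # middle character is a zero
```

Note that a homograph does not have to involve non-ASCII characters; plain ASCII pairs such as capital I and lowercase l are enough to fool a casual observer.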

BOTS2 300: According to Frothly's records, what is the likely MAC address of Mallory's corporate MacBook? Answer guidance: Her corporate MacBook has the hostname MACLORY-AIR13. (10 pts)
Hints: Use Asset Center in ES.

BOTS2 301: What episode of Game of Thrones is Mallory excited to watch? Answer guidance: Submit the HBO title of the episode. (10 pts)
Hints: Look for video files downloaded to MACLORY-AIR13.

BOTS2 302: What is Mallory Krauesen's phone number? Answer guidance: ddd-ddd-dddd where d=[0-9]. No country code. (10 pts)
Hints: Use Identity Center in ES.

BOTS2 303: Enterprise Security contains a threat list notable event for MACLORY-AIR13 and a suspect IP address. What is the name of the threatlist (i.e. Threat Group) that is triggering the notable? (10 pts)
Hints: Look for threat activity from Mallory's MacBook in the Incident Review dashboard.

BOTS2 304: Considering the threatlist you found in the question above, and related data, what protocol often used for file transfer is actually responsible for the generated traffic? (10 pts)
Hints: Do you see MACLORY-AIR13 communicating with known Tor addresses? That's misleading.

BOTS2 305: Mallory's critical PowerPoint presentation on her MacBook gets encrypted by ransomware on August 18. At what hour, minute, and second does this actually happen? Answer guidance: Provide the time in PDT. Use the 24h format HH:MM:SS, using leading zeroes if needed. Do not use Splunk's _time (index time). (15 pts)
Hints: People that work on PowerPoint presentations generally save them in their Documents folder.

The time that Splunk indexed this information might not be the time the file was modified.

BOTS2 306: How many seconds elapsed between the time the ransomware executable was written to disk on MACLORY-AIR13 and the first local file encryption? Answer guidance: Use the index times (_time) instead of other timestamps in the events. (15 pts)
Hints: What time did the 'Office 2016 Patcher.app' get added to MACLORY-AIR13's filesystem?

What time was the first file with *.crypt added to MACLORY-AIR13's filesystem?

BOTS2 307: Kevin Lagerfield used a USB drive to move malware onto kutekitten, Mallory's personal MacBook. She ran the malware, which obfuscates itself during execution. Provide the vendor name of the USB drive Kevin likely used. Answer Guidance: Use time correlation to identify the USB drive. (15 pts)
Hints: osquery_results is a great sourcetype to review.

Look for unusual files in a place that Mallory would come across them.

If you can figure out what kind of malware this is, do some open source intelligence research to determine how it behaves. Find an online database of USB vendors.

Various sourcetypes can tell you how things look when they run. Look at 'ps' and look at 'osquery_results' from kutekitten.

BOTS2 308: What programming language is at least part of the malware from the question above written in? (15 pts)
Hints: Review the hints for question 307.

BOTS2 309: The malware from the two questions above appears as a specific process name in the process table when it is running. What is it? (10 pts)
Hints: Review the hints for question 307.

BOTS2 310: The malware infecting kutekitten uses dynamic DNS destinations to communicate with two C&C servers shortly after installation. What is the fully-qualified domain name (FQDN) of the first (alphabetically) of these destinations? (10 pts)
Hints: Have a look at the stream:dns sourcetype and observe queries from kutekitten.

You need a lookup. Find one, and also review this: https://www.splunk.com/blog/2015/08/04/detecting-dynamic-dns-domains-in-splunk.html

BOTS2 311: From the question above, what is the fully-qualified domain name (FQDN) of the second (alphabetically) contacted C&C server? (10 pts)
Hints: Review the hints for question 310.

BOTS2 312: What is the average Alexa 1M rank of the domains between August 18 and August 19 that MACLORY-AIR13 tries to resolve while connected via VPN to the corporate network? Answer guidance: Round to two decimal places. Remember to include domains with no rank in your average! Answer example: 3.23 or 223234.91 (15 pts)
Hints: You're going to need a lookup. Are there any loaded in the system that might help you?

We want the average of ranks. Not the average of hits to the domains.


BOTS2 313: Two .jpg-formatted photos of Mallory exist in Kevin Lagerfield's server home directory that have eight-character file names, not counting the .jpg extension. Both photos were encrypted by the ransomware. One of the photos can be downloaded at the following link, replacing 8CHARACTERS with the eight characters from the file name. https://splunk.box.com/v/8CHARACTERS After you download the file to your computer, decrypt the file using the encryption key used by the ransomware. What is the complete line of text in the photo, including any punctuation? Answer guidance: The encryption key can be found in Splunk. (20 pts)
Hints: Understanding from OSINT how this ransomware behaves is key to the answer.

This ransomware is called 'Patcher' and it is terribly written and uses *NIX command line tools to wreak havoc.

Patcher uses the UNIX zip utility.

BOTS2 400: A Federal law enforcement agency reports that Taedonggang often spearphishes its victims with zip files that have to be opened with a password. What is the name of the attachment sent to Frothly by a malicious Taedonggang actor? (10 pts)
Hints: Frothly uses the Splunk wiredata product 'Stream' to collect email metadata. Look at the sourcetype stream:smtp

The question mentions that Taedonggang sends a 'zip' file. Look in the sourcetype in hint 1 for attachments with a .zip extension.

BOTS2 401: The Taedonggang APT group encrypts most of their traffic with SSL. What is the "SSL Issuer" that they use for the majority of their traffic? Answer guidance: Copy the field exactly, including spaces. (10 pts)
Hints: You might need to get more information before you tackle this question. Have you figured out the IP address of Taedonggang's server?

Frothly currently only collects SSL data with Stream. Look at the sourcetype 'stream:tcp' for more information about SSL data.

'Issuer' is a value found in a TLS/SSL certificate. Try to find SSL/TLS certificates tied to the IP address of Taedonggang's attacking server.

Look in sourcetype=stream:tcp with the IP address of Taedonggang and the field ssl_issuer.

BOTS2 402: Threat indicators for a specific file triggered notable events on two distinct workstations. What IP address did both workstations have a connection with? (10 pts)
Hints: Check out the Incident Review dashboard.

Open notable events for more details.

Look for two notable events with the exact same title that has a filename in it.

BOTS2 403: Based on the IP address found in question 402, what domain of interest is associated with that IP address? (10 pts)
Hints: Investigations might shed some light on this.

Did you know Enterprise Security has the ability to collect notes and screenshots from other analysts including threat intelligence?

Find the investigation with the attachment to gain some additional intelligence about the threat.

BOTS2 405: What is the first and last name of the poor innocent sap who was implicated in the metadata of the file that executed PowerShell Empire on the first victim's workstation? Answer example: John Smith (15 pts)
Hints: This is an open source intelligence question. You will need to find the file name/hash of the file that first infected Frothly (think of the extracted file from the answer to question 400) and then pivot off to the internet. If you have found the file that first infected Frothly with PowerShell Empire take a look at the Incident Review dashboard. You should find the hash and pivot off that hash in open source intelligence sources. Look at the chart in https://www.splunk.com/blog/2017/07/21/work-flow-ing-your-osint.html for a commonly-used sandbox site that takes file hashes.

Find the answer to question 400. Look in the logs to find the name of document file extracted from the zipped attachment. Search for that filename in the Incident Review 'Search' filter. Take the hash mentioned in the 'comments' field and search Virustotal for that hash.

BOTS2 406: What is the average Shannon entropy score of the subdomain containing UDP-exfiltrated data? Answer guidance: Cut off, not rounded, to the first decimal place. Answer examples: 3.2 or 223234.9 (15 pts)
Hints: First you will need to find the domain associated with the exfiltrated data. Look at the Stream metadata for a UDP protocol often used to exfiltrate data.

Review the stream:dns sourcetype and find the IP address that has a high number of queries but is not a normal/legitimate target for DNS queries (i.e., not RFC1918 or Open DNS server). Look at the domain in the queries to that IP address. Pivot off of that to calculate Shannon entropy.

If you have never calculated Shannon entropy, look at the documentation for the tool 'URL TOOLBOX' or recent entries in https://www.splunk.com/blog/2017/07/06/hunting-with-splunk-the-basics.html. This will teach you how to calculate Shannon entropy. Also review https://www.splunk.com/pdfs/events/govsummit/hunting_the_known_unknowns_with_DNS.pdf where you can learn how to detect DNS exfiltration.
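For reference, here is a minimal Python sketch of the textbook Shannon entropy formula (base 2, computed over character frequencies), together with the "cut off, not rounded" step from the answer guidance. It assumes the textbook definition is what the question intends; the subdomain strings are made up, and results from URL Toolbox inside Splunk should be treated as authoritative.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Entropy in bits per character: -sum(p * log2(p)) over character frequencies."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(text).values())


def truncate_one_decimal(value: float) -> float:
    """Cut off (do not round) to the first decimal place; intended for non-negative values."""
    return math.floor(value * 10) / 10


# Hypothetical subdomain strings for illustration
subdomains = ["aaaa", "abcd", "a1b2c3d4"]
scores = [shannon_entropy(s) for s in subdomains]
average = sum(scores) / len(scores)
print(truncate_one_decimal(average))
```

A repeated character scores 0 and a string of all-different characters scores the highest, which is why encoded exfiltration data tends to stand out from ordinary hostnames.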

BOTS2 407: To maintain persistence in the Frothly network, Taedonggang APT configured several Scheduled Tasks to beacon back to their C2 server. What single webpage is most contacted by these Scheduled Tasks? Answer guidance: Remove the path and type a single value with an extension. Answer example: index.php or images.html (20 pts)
Hints: Review the question for keywords and search against the hosts in the network.

Look in the sysmon logs for workstations: XmlWinEventLog:Microsoft-Windows-Sysmon/Operational if you haven't figured out where to start!

Once you find the event for scheduled tasks, you will need to pivot to the sourcetype=WinRegistry. In that sourcetype, look for where the scheduled task receives its destination information. You will need to decode it!

BOTS2 408: The APT group Taedonggang is always building more infrastructure to attack future victims. Provide the IPv4 address of a Taedonggang-controlled server that has a completely different first octet to other Taedonggang-controlled infrastructure. Answer guidance: has a different first octet than (20 pts)
Hints: Look through your notes of this incident, if you have any. Specifically look at the IP addresses used by Taedonggang. You will need to take information from the Taedonggang infrastructure seen attacking Frothly and pivot to open source intelligence. Specifically look at the C2 IP address used by Taedonggang to control their PowerShell Empire agents. Remember that less is more! Sometimes the absence of data helps you find things.

Look at the SSL certificates. Think about fields that you can pivot on in open source intelligence.

Taking information from hint number 3, pivot off of different fields in an open source intelligence website that catalogs SSL certificates until you find the server! Review https://www.splunk.com/blog/2017/07/21/work-flow-ing-your-osint if you need help finding OSINT websites.

BOTS2 409: The Taedonggang group had several issues exfiltrating data. Determine how many bytes were successfully transferred in their final, mostly successful attempt to exfiltrate files via a method using TCP, using only the data available in Splunk logs. Use 1024 for byte conversion. (20 pts)
Hints: The data for this question is located in sourcetype=stream:ftp

Review the sourcetype referenced in hint one on August 25, 2017. You'll notice four distinct bursts of activity. Look at the largest one for the information you require. Find the start message in the logs (there is no stop). A key word is 'successful'.

The data is NOT in a Splunk field of 'bytes'. You will need to write a regex against the data to find the answer. Review https://www.splunk.com/blog/2017/08/30/rex-groks-gibberish.html if you need help writing a regex!

The information in the field you are parsing will have something like 'Megabytes per second' and 'Kilobytes' per second. Make sure you do your calculations with those terms in mind.
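The unit handling described in these hints can be sketched in Python. The message format below is entirely hypothetical (the real events will look different), and the regular expression and function name are assumptions for illustration. The sketch shows two points: skip any figure followed by "per second", since that is a rate rather than a size, and convert with 1024 as the question requires.

```python
import re

# Multipliers using 1024, as the question specifies
UNIT_BYTES = {"bytes": 1, "kilobytes": 1024, "megabytes": 1024 ** 2}

# A number, then a unit, NOT followed by "per second" (which would be a rate)
SIZE_RE = re.compile(
    r"([\d,]+(?:\.\d+)?)\s*(bytes|kilobytes|megabytes)\b(?!\s+per\s+second)",
    re.IGNORECASE,
)


def transferred_bytes(message: str):
    """Return the first transfer size in the message, in bytes, or None."""
    match = SIZE_RE.search(message)
    if match is None:
        return None
    amount = float(match.group(1).replace(",", ""))
    return amount * UNIT_BYTES[match.group(2).lower()]


# Hypothetical message text, not copied from the dataset
sample = "Successfully transferred 2048.5 Kilobytes at 1.2 Megabytes per second"
print(transferred_bytes(sample))
```

The same idea carries over to a Splunk rex extraction followed by an eval that multiplies by 1024 or 1024*1024 depending on the captured unit.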

BOTS2 500: Individual clicks made by a user when interacting with a website are associated with each other using session identifiers. You can find session identifiers in the stream:http sourcetype. The Frothly store website session identifier is found in one of the stream:http fields and does not change throughout the user session. What session identifier is assigned to dberry398@mail.com when visiting the Frothly store for the very first time? Answer guidance: Provide the value of the field, not the field name. (10 pts)
Hints: Find the source IP address that our user of interest is using, then broaden your search such that you can view all events specific to the user's src ip address.

HTTP cookies often contain information specific to a user session, including session identifiers.

"After you get the events specific to the user's src ip address, you can append a '| reverse |table cookie' to get a better view of the cookies that the user clicked."

BOTS2 501: How many unique user ids are associated with a grand total order of $1000 or more? (15 pts)
Hints: When a user fills out a web form passing information such as username, password, credit card numbers, etc., it's passed via a standard http field (form_data) which is captured by stream:http. Extract the username from that field and store it in a new field.

You're going to need to look deeper into the packet at a field called dest_content to extract the grand order total. Look for the following string and use it in a regular expression to capture the value: 'grand_total'.

The 'stats' command is useful for helping you to link several pieces of context together that occur within a single clickstream.
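The three hints above describe a pattern: extract a username and an order total from different events, then tie them together by session. The Python sketch below imitates that pattern on made-up events. The dictionary keys, the form_data layout, and the JSON-like shape around grand_total are all assumptions for illustration; check the real events before writing your rex expressions.

```python
import re
from collections import defaultdict
from urllib.parse import unquote

# Assumed layouts: 'username=...' inside form_data, '"grand_total":123.45' inside dest_content
USER_RE = re.compile(r"username(?:\]|%5D)?=([^&\s]+)")
TOTAL_RE = re.compile(r'"grand_total"\s*:\s*"?(\d+(?:\.\d+)?)')


def big_spenders(events, threshold=1000.0):
    """Usernames seen in any session whose largest grand_total meets the threshold."""
    users = defaultdict(set)     # session id -> usernames seen in that session
    totals = defaultdict(float)  # session id -> largest grand_total seen
    for event in events:
        sid = event["session"]
        found = USER_RE.search(event.get("form_data", ""))
        if found:
            users[sid].add(unquote(found.group(1)))
        for value in TOTAL_RE.findall(event.get("dest_content", "")):
            totals[sid] = max(totals[sid], float(value))
    return {name
            for sid, names in users.items()
            if totals.get(sid, 0.0) >= threshold
            for name in names}


# Hypothetical events for illustration
events = [
    {"session": "s1", "form_data": "login[username]=alice%40example.com&login[password]=x", "dest_content": ""},
    {"session": "s1", "form_data": "", "dest_content": '{"grand_total":1200.50,"items":[]}'},
    {"session": "s2", "form_data": "login[username]=bob%40example.com", "dest_content": '{"grand_total":99.0}'},
]
print(len(big_spenders(events)))
```

Grouping by session first and filtering afterwards mirrors running stats by the session identifier and then applying a where clause.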

BOTS2 502: Which user, identified by their email address, edited their profile before placing an order over $1000 in the same clickstream? Answer guidance: Provide the user ID, not other values found from the profile edit, such as name. (15 pts)
Hints: Explore the stream:http data and try to determine what sort of context you can derive from the name of pages that the user is visiting.

The uri_path field will tell you which pages the user has clicked on and provide hints as to what their session looks like. Spend some time looking through the various uri_path values or get clever and think about keywords that may lead you to the uri_path value that indicates a user editing their account profile.

This looks interesting: '/magento2/customer/account/editPost/'

BOTS2 503: What street address was used most often as the shipping address across multiple accounts, when the billing address does not match the shipping address? Answer example: 123 Sesame St (15 pts)
Hints: Perform all field extractions involving information of interest first before joining the events together and applying further constraints. The usernames and passwords need to be extracted from the form_data field. The shipping and billing address can be found in the src_content field.

Use the stats command to create a list of interesting context by session identifier. Remember to apply your constraints after the stats command to look at sessions where the shipping and billing address are both present in the session and they are not equal to each other.

Users who have made a purchase in the past automatically have previous shipping destinations displayed in their browser, which can be found in the stream:http field src_content. After the user changes their shipping address, you will see a different value displayed for the shipping address. If you are fixated on the first shipping address found within a stream, you are going to be stuck. Take a look in this url for the user-modified shipping address during the payment process: http://store.froth.ly/magento2/rest/default/V1/carts/mine/payment-information

BOTS2 504: What is the domain name used in email addresses by someone creating multiple accounts on the Frothly store website (http://store.froth.ly) that appear to have machine-generated usernames? (15 pts)
Hints: Extract the session identifier from the form_key field and use it to tie together context of interest.

The usernames and passwords need to be extracted from the form_data field. Here is an example of how to extract the domain from a sample splUsername field: |rex field=splUsername '(?[^\.|^\@]+.[^.]+)$'

Take a look at Enterprise Security and you might find a notable event related to the web store identifying anomalous activity.
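Once usernames have been extracted, counting their email domains makes a cluster of machine-generated accounts easier to see. A minimal Python sketch, using made-up usernames: this version keeps everything after the last '@', which is an assumption and may differ from what a rex-based extraction returns.

```python
from collections import Counter


def email_domain(username: str) -> str:
    """Everything after the last '@', lowercased."""
    return username.rsplit("@", 1)[-1].lower()


# Hypothetical usernames for illustration
usernames = ["qx7f2k@example.net", "zz91ab@example.net", "alice@example.org"]
domain_counts = Counter(email_domain(u) for u in usernames)
print(domain_counts.most_common(1))
```

A domain that appears far more often than the others, paired with random-looking local parts, is the kind of pattern the question is pointing at.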

BOTS2 505: Which user ID experienced the most logins to their account from different IP address and user agent combinations? Answer guidance: The user ID is an email address. (10 pts)
Hints: Use the rex command to grab the username and session identifier from the cookie and form_data fields.

Since we are looking for unique combinations of IP and user agent, it's helpful to combine those two values into a single field. One way to do this is to use the eval command. https://answers.splunk.com/answers/100463/adding-strings-from-2-fields-into-1.html After you get your context listed by session identifier, you can run stats a second time to further narrow your results down to the unique IP and user agent combination.

BOTS2 506: What is the most popular coupon code being used successfully on the site? (20 pts)
Hints: The coupon codes need to be extracted from the request field.

You will need to dig a little deeper to determine whether or not a coupon submission was successful. Take a look in the dest_content to figure that out.

Use the eval command to create a field name to store whether a coupon submission was successful. If you can find strings in the dest_content indicative of success or failure, you can use this example as a framework: eval newField=if(like(dest_field,"%your_search_string%"),"Success",dest_field)

BOTS2 507: Several user accounts sharing a common password is usually a precursor to an undesirable scenario orchestrated by a fraudster. Which password is being seen most often across users logging into http://store.froth.ly? (10 pts)
Hints: All the context you need to get this question is in the cookie and form_data fields.

Forget about the session identifier for this question. Use stats to get your context lined up 'by password'

Use the stats aggregate function 'dc' to get a distinct count of the values within a particular field. Example: ...|stats dc(field1) as distinct_count by field2.
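To make the behavior of dc concrete, here is a small Python analogue of 'stats dc(field1) by field2'. The row layout and field names are made up for illustration and are not the fields in the dataset.

```python
from collections import defaultdict


def distinct_count_by(rows, value_field, group_field):
    """Count distinct value_field values within each group_field value."""
    groups = defaultdict(set)
    for row in rows:
        if value_field in row and group_field in row:
            groups[row[group_field]].add(row[value_field])
    return {group: len(values) for group, values in groups.items()}


# Hypothetical rows for illustration
rows = [
    {"user": "a@example.com", "password": "hunter2"},
    {"user": "b@example.com", "password": "hunter2"},
    {"user": "a@example.com", "password": "hunter2"},   # repeat login, counted once
    {"user": "c@example.com", "password": "s3cret"},
]
print(distinct_count_by(rows, "user", "password"))
```

Because sets ignore duplicates, a user who logs in many times with the same password contributes only once, which is the difference between dc and a plain count.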

BOTS2 508: Which HTML page was most clicked by users before landing on http://store.froth.ly/magento2/checkout/ on August 19th? Answer guidance: Use earliest=1503126000 and latest=1503212400 to identify August 19th. Answer example: http://store.froth.ly/magento2/bigbrew.html (15 pts)
Hints: Set your date range appropriately and look for events containing the desired URL.

The answer is directly referenced within the events containing the specified URL.

The http_referrer field will tell us which page the user was on just prior to landing on the specified checkout page.
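To confirm what the earliest and latest values in the answer guidance cover, you can convert them from epoch seconds to Pacific Daylight Time. A minimal Python sketch, assuming a fixed UTC-7 offset (PDT was in effect in August 2017); the function name is just for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-7 offset for Pacific Daylight Time
PDT = timezone(timedelta(hours=-7), "PDT")


def epoch_to_pdt(epoch: int) -> str:
    """ISO 8601 timestamp in PDT for a Unix epoch value."""
    return datetime.fromtimestamp(epoch, tz=PDT).isoformat()


print(epoch_to_pdt(1503126000))   # 2017-08-19T00:00:00-07:00
print(epoch_to_pdt(1503212400))   # 2017-08-20T00:00:00-07:00
```

The two values are exactly 86,400 seconds apart, so together they bound the whole of August 19th in PDT.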

BOTS2 509: Which HTTP user agent is associated with a fraudster who appears to be gaming the site by unsuccessfully testing multiple coupon codes? (20 pts)
Hints: The coupon codes need to be extracted from the request field.

You will need to dig a little deeper to determine whether or not a coupon submission was successful. Take a look in the dest_content to figure that out.

Use the eval command to create a field name to store whether a coupon submission was successful. If you can find strings in the dest_content indicative of success or failure, you can use this example as a framework: eval newField=if(like(dest_field,"%your_search_string%"),"Success",dest_field)

Posted 8-21-19