Archive for the ‘cricket’ Category

Using black hole routing for network security monitoring - Part 2

Thursday, March 18th, 2010

Black hole routing is now your friend. You are forwarding traffic not meant to be routed to a termination point on your network. Let’s look at some advanced ways to get more focused network security data from this traffic. For part 1, skip to the next post.

Advanced network monitoring

  • Forward the black hole routes to a passive Honeypot such as Nepenthes
    • This would entail having either the black hole router or the core router forward the traffic to the Honeypot as the next-hop for all the black holed routes.
    • The honeypot will accept connections and let the operator know what the evildoer was up to, e.g. C2 channels, file transfers, protocol tunnelling, scanning, exploiting, etc.
  • Reconfigure or implement a DNS black list to point the blocked traffic to specific black hole networks. Traffic can be categorized using pre-built filters in the analysis station.
    • Known malware+spyware black list resolves to x.x.x.A
    • Domains, sub-domains or FQDNs that are unrelated to business needs (.tv, .sex) black list resolves to x.x.x.B
    • Domains, sub-domains or FQDNs considered security risks* (.ru, .cn, .ws, .cm, .info, etc.) black list resolves to x.x.x.C. McAfee publishes a list of the riskiest TLDs, Mapping the Mal Web.
    • Blocked web sites due to company policy resolves to x.x.x.D
    • In maintenance or no longer available internal servers resolves to x.x.x.E
    • Other blocked content that is of no interest to network or security analysts resolves to 127.0.0.1
  • Advertise specific black hole routes that target nasty netblocks.
    • If traffic reaching specific routes needs to be split out, convert the link between the core router and the black hole router to an 802.1Q trunk with sub-interfaces. This way you can advertise each group of nasty routes on its own sub-interface. Reporting would then know which 802.1Q tag is associated with which group of nasties.
    • The risk with explicitly blocking network blocks or domains is that the user community may not be aware they HAVE been blocked administratively, since no web page or email bounce-back tells them, hey, the resource you are trying to reach is blocked! Some type of advisory should be published stating that some domains or netblocks may be blocked, with a pointer to the list or to the help desk.
  • Graph the traffic hitting each sub-interface or interface on the black-hole router using SNMP and Cricket or Cacti.
  • Run ntop, sancp or argus on the analysis station to get session data.
  • Find a way to generate tickets or events back into analysts’ consoles for interesting traffic hitting the black hole router.

Recap of the benefits

  • Cleaning up most of the above classes of traffic helps fix operational problems.
  • Storing this traffic in your network analyzer gives you an easy-to-maintain forensic archive of infection vectors.
  • A very cheap alternative to a honeypot.
  • Forensic history of problematic sources in the managed network.
  • Provides a list of workstations and their associated users that are risks to the managed LAN.
  • An easy way to archive the presence of traffic that was marked for deletion by other systems (policy-based routing, DNS, proxies).
  • Can serve as a source of automated tickets for ops and events for security folks, since what needs to be fixed is always on the managed side of the LAN or WAN.
  • I would love to hear about other benefits.

  • Provides a looking glass into misconfigured, unauthorized, unexpected or nonexistent traffic or destinations.
  • Cheap and easy to set up
  • Maintenance and operation can be very automated
  • Integrates with existing processes (ticketing, change management, backups, monitoring, logging)
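
The DNS black list categories above lend themselves to a small tally on the analysis station. Here is a minimal sketch, assuming the destination addresses of black holed sessions are already available as strings (for example from session data). The 192.0.2.x addresses are hypothetical stand-ins for x.x.x.A through x.x.x.E, and the category labels are made up for illustration.

```python
import ipaddress
from collections import Counter

# Hypothetical sinkhole addresses standing in for x.x.x.A .. x.x.x.E.
SINKHOLES = {
    ipaddress.ip_address("192.0.2.1"): "malware/spyware",
    ipaddress.ip_address("192.0.2.2"): "non-business",
    ipaddress.ip_address("192.0.2.3"): "risky TLD",
    ipaddress.ip_address("192.0.2.4"): "company policy",
    ipaddress.ip_address("192.0.2.5"): "retired internal server",
}


def categorize(destinations):
    """Count black holed sessions per black list category."""
    counts = Counter()
    for dst in destinations:
        category = SINKHOLES.get(ipaddress.ip_address(dst), "uncategorized")
        counts[category] += 1
    return counts


if __name__ == "__main__":
    sample = ["192.0.2.1", "192.0.2.1", "192.0.2.3", "10.0.0.1"]
    for category, hits in categorize(sample).items():
        print(category, hits)
```

Each per-category count can then be handed to the trending tool as its own data source.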

Active Directory security event trending as an NSM component

Thursday, March 4th, 2010


How to use trending tools, like Cricket, Splunk or Cacti, to visualize the Active Directory event log. Time series trending provides a different perspective from the typical SIEM. It is also useful for capacity planning, security analysis and operations.

First, getting Event log data to a trending tool

  • AD Domain Controller global auditing policy enabled (applied on each DC)
  • AD Domain Controller reset group policy audit inheritance for Windows 2003 DCs so the audit policies apply down to actual objects. There is a third-party tool available to do this.
  • AD Domain Controller auditing policies for specific GPOs enabled
  • Exporting Events from Windows to the log server
  • Normalizing timestamps on the syslog server
  • Generating event counts every X minutes
  • Generating host/user/group specific event counts every X minutes
  • Collecting the information in the trending tool (Cricket, Cacti, Splunk)

A little magic goes a long way. Let’s lift the covers and visualize security events that would normally end up in event log wasteland or in Yet Another $$ management console.

AD Domain Controller logging security events

Turn on auditing for interesting events. The point is not to graph everything; it is to graph useful data. I have compiled a list of event log event types that would benefit from graphing. Examples of specific EventIDs or error codes are listed at the bottom of the post.

Second, get the data to your syslog server

The Windows event log of the Domain Controller needs to be exported to a syslog server. Snare is a great open-source tool for just this purpose. Windows Remote Event collection could also be used to import the data into a management server with Snare, but events can’t be pre-filtered on the AD. It is also possible to use the WMI/DCOM interface or a Splunk agent if the plan is to use Splunk to do the collection and processing. The transport method should be taken into consideration: if the intermediate networks could be compromised, use an IPSec tunnel to transport the syslog data back to the management server.

Third, normalize the syslog data

  • On the syslog server, process the incoming logs to make sure they have consistent timestamps.
  • Create a script to process incoming data and identify instances of events based on EventID or error code, plus any group object fields (for example, Deletion or Creation) for actions related to an event. You will need a bit of regex magic at this point to count and collect the data for the various event log messages.
  • If using Cricket or Cacti, create a script that counts the event types and outputs the value. The output could be piped directly to the collection process or sent to a flat file (recommended) for later retrieval.
  • If using Splunk, the timestamps should be normalized automatically. The next step would be to create a search filter that will find and add up all the matched events. Create a report with each family of data to display. Look for some Splunk-fu in a future post showing how to do it.
  • Run your trending tool’s collection process as an exec type script or read the flat file input.
  • Apply thresholds, aberrant behaviour detection or other statistical methods to flag problems.
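
The counting step above can be sketched in a few lines. This is only an illustration under stated assumptions: the line layout (ISO timestamp, host, then an `EventID=` field) is a hypothetical, already-normalized format, not what Snare actually emits, and the bucket size is assumed to divide evenly into 60 minutes.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical normalized layout: "<ISO timestamp> <host> ... EventID=<n> ..."
LINE = re.compile(r"^(?P<ts>\S+)\s+(?P<host>\S+)\s+.*?\bEventID=(?P<eid>\d+)")


def count_events(lines, bucket_minutes=5):
    """Count events per (time bucket, EventID); bucket_minutes must divide 60."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed layout
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S")
        # Round the timestamp down to the start of its bucket.
        bucket = ts.replace(minute=ts.minute - ts.minute % bucket_minutes, second=0)
        counts[(bucket, int(match.group("eid")))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2010-03-04T10:02:17 dc01 EventID=529 user=alice",
        "2010-03-04T10:04:59 dc01 EventID=529 user=bob",
        "2010-03-04T10:05:00 dc01 EventID=528 user=bob",
    ]
    for (bucket, event_id), hits in sorted(count_events(sample).items()):
        print(bucket.isoformat(), event_id, hits)
```

Writing the printed lines to a flat file gives the trending tool something cheap to read on each collection run.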

Visualizing events

To make the most of the data, it needs to be optimized for display. Each family of events can be stacked together so they can be viewed in a single timeline. Data can be presented using the negative and positive axis to differentiate events such as logins and logouts.
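
As a small illustration of the positive/negative axis idea, the sketch below negates the series that should be drawn under the zero line before they are handed to the grapher. The family names and the dictionary layout are assumptions made for this example.

```python
def to_signed_series(counts_by_family, negative_families=("logout",)):
    """Negate the families that should be drawn below the zero line."""
    signed = {}
    for family, values in counts_by_family.items():
        sign = -1 if family in negative_families else 1
        signed[family] = [sign * value for value in values]
    return signed


if __name__ == "__main__":
    print(to_signed_series({"login": [3, 5], "logout": [1, 4]}))
```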

Windows Audit Event Families of Interest

  • Login
  • Logout
  • Login Types
  • Authentication failures
  • Kerberos error codes
  • Domain logs cleared
  • Account creations/deletion from Account Management Audit Policy
  • Delegation of admin authority
  • File access/changes/creation
  • Group Policy changes
  • Root write gpLink or gpOptions
  • Number of client sessions
  • Number of DCs
  • Number of GCs
  • SyncRequest
  • Trust and relationship
  • Inbound replication statistics
  • Outbound replication statistics

This provides a global view of Active Directory events. The specific codes of each family can be obtained from Microsoft KB articles and also from Randy Franklin Smith’s UltimateWindowsSecurity.com website. I have listed a few codes below, but there are many more on Randy’s web site. He also has a nifty web interface to navigate the various objects. An amazing site.

Drilling down

Security views can be created based on asset criticality. Select certain hosts, users or groups of hosts to be displayed individually. This way you get both a macro view and a micro view of what you (or management) consider important.

Error and EventIds for the various event families

Kerberos authentication failures

Error code and description

  • 6 The username doesn’t exist.
  • 12 Workstation restriction; logon time restriction.
  • 18 Account disabled, expired, or locked out.
  • 23 The user’s password has expired.
  • 24 Pre-authentication failed; usually means bad password
  • 32 Ticket expired. This is a normal event that gets logged frequently by computer accounts.
  • 37 The workstation’s clock is too far out of synchronization with the DC’s clock.

NTLM Error codes

    DEC HEX and description

  • 3221225572 C0000064 user name does not exist
  • 3221225578 C000006A user name is correct but the password is wrong
  • 3221226036 C0000234 user is currently locked out
  • 3221225586 C0000072 account is currently disabled
  • 3221225583 C000006F user tried to log on outside his day-of-week or time-of-day restrictions
  • 3221225584 C0000070 workstation restriction
  • 3221225875 C0000193 account expiration
  • 3221225585 C0000071 expired password
  • 3221226020 C0000224 user is required to change password at next logon
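
Logs may carry these codes in either decimal or hex, so a small lookup helps when normalizing. The sketch below uses only the codes listed above; the function name is made up for this example.

```python
# NTLM error codes from the list above, keyed by their numeric value.
NTLM_STATUS = {
    0xC0000064: "user name does not exist",
    0xC000006A: "user name is correct but the password is wrong",
    0xC0000234: "user is currently locked out",
    0xC0000072: "account is currently disabled",
    0xC000006F: "logon outside day-of-week or time-of-day restrictions",
    0xC0000070: "workstation restriction",
    0xC0000193: "account expiration",
    0xC0000071: "expired password",
    0xC0000224: "user is required to change password at next logon",
}


def describe_ntlm_code(decimal_code):
    """Return (hex string, description) for a decimal NTLM error code."""
    code = int(decimal_code)
    return f"{code:08X}", NTLM_STATUS.get(code, "unknown")


if __name__ == "__main__":
    print(describe_ntlm_code("3221225578"))
```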

Logons should be displayed on the positive axis and logoffs on the negative axis.

EventID and description

  • 528 Successful Logon
  • 540 Successful Network Logon (Windows 2000, XP, 2003 Only)
  • 529 Logon Failure - Unknown user name or bad password
  • 530 Logon Failure - Account logon time restriction violation
  • 531 Logon Failure - Account currently disabled
  • 532 Logon Failure - The specified user account has expired
  • 533 Logon Failure - User not allowed to logon at this computer
  • 534 Logon Failure - The user has not been granted the requested logon type at this machine
  • 535 Logon Failure - The specified account’s password has expired
  • 539 Logon Failure - Account locked out

Other security events from a Domain Controller

  • 675 (Audit account logon events): Event 675 on a domain controller indicates a failed initial attempt to log on via Kerberos at a workstation with a domain account, usually due to a bad password; the failure code indicates exactly why authentication failed. See the Kerberos failure codes above.
  • 676, or failed 672 (Audit account logon events): Event 676 gets logged for other types of failed authentication. See the Kerberos failure codes above. NOTE: Windows 2003 Server logs a failed event 672 instead of 676.
  • 681, or failed 680 (Audit account logon events): Event 681 on a domain controller indicates a failed logon via NTLM with a domain account. The error code indicates exactly why authentication failed. See the NTLM error codes above. NOTE: Windows 2003 Server logs a failed event 680 instead of 681.
  • 642 (Audit account management): Event 642 indicates a change to the specified user account, such as a reset password or a disabled account being re-enabled. The event’s description specifies the type of change.
  • 632, 636, 660 (Audit account management): All three events indicate the specified user was added to the specified group. Group scopes Global, Local and Universal correspond to the three event IDs.
  • 624 (Audit account management): New user account was created.
  • 644 (Audit account management): Specified user account was locked out after repeated logon failures.
  • 517 (Audit system events): The specified user cleared the security log.

On log analysis

If you are only doing log visualization as described above, with no log retention or analysis, this would be your next step. What to do with the logs is a wide-open debate. Here is one possible scenario for those starting out.

  • Splunk as a log retention, analysis, reporting and alerting tool for both security and operations
  • Splunk can be fed operational and/or security events from pretty much anything
  • If more security filtering and processing is required, a SIM/SIEM type service could be set up in parallel

Event visualization is an indicator, to the same extent that an IDS alert is an indication to look deeper; it is never an end in itself. I also advocate leveraging existing tools for quick wins. It may be that a company wants or needs a really advanced log analysis tool with pre-built logic and expert systems. But as I like to say, one step at a time: selecting tools that may cost hundreds of thousands of dollars over their lifetime should not be done without understanding what’s what. Cheers.

DNS event trending as an NSM component

Wednesday, March 3rd, 2010


Making use of time series trending for network security monitoring. A short history with examples.

Time series trending tools like Cricket, Cacti and Torrus focus on performance and availability.

Operational teams use these day in, day out; the tools are there to organize and display reams of data in a quick and painless way. Time series trending is a source of indicators.

Strengths of time series trending

  • detailed baseline
  • displays seasonality
  • highlights anomalies
  • illustrates subtle changes
  • provisioning planning

Strengths of time series trending using RRD databases

  • fast visualization - anyone having had to suffer SQL based trending tools can attest
  • low maintenance
  • graphing flexibility
  • capabilities can be extended
  • no vendor lock-in or annual maintenance fees

Making use of these strengths to identify threats is a secret recipe. Yeah, reaaally.

Here are some DNS-based trends that can help quantify, understand, and defend against or clean up after extrusion attempts or malware/botnet command and control communications.

  • Trending the number of hits against security related blacklisted entries
  • Trending the number of hits against .cn, .ru, etc. specific domains
  • Trending the number of hits against honeypot entries
  • Trending the number of hits by source IP
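
Two of the trends above (hits per watched TLD and hits per source IP) can be sketched as follows. It assumes the DNS query log has already been parsed into (source IP, query name) pairs; the log format itself is not covered here, and the watched TLD set simply reuses the .cn and .ru examples.

```python
from collections import Counter

WATCHED_TLDS = {"cn", "ru"}  # example TLDs taken from the list above


def dns_trends(queries, watched_tlds=WATCHED_TLDS):
    """Count watched-TLD hits per TLD and per source IP.

    queries is an iterable of (source_ip, query_name) pairs.
    """
    by_tld = Counter()
    by_source = Counter()
    for src_ip, qname in queries:
        # Take the last label, ignoring a trailing root dot and letter case.
        tld = qname.rstrip(".").rsplit(".", 1)[-1].lower()
        if tld in watched_tlds:
            by_tld[tld] += 1
            by_source[src_ip] += 1
    return by_tld, by_source


if __name__ == "__main__":
    sample = [("10.0.0.5", "evil.example.cn."), ("10.0.0.9", "www.example.com")]
    print(dns_trends(sample))
```

Run once per collection interval, the two counters give one value per TLD and per source to feed the trending tool.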

The data value can be extended by doing a bit of munging on the interesting bits.

  • Google-fu, to automate searches of the IP addresses or DNS names to see if they are related to specific malware or botnets. This will help prioritize the cleaning efforts.
  • Anomaly detection such as Holt-Winters smoothing with confidence intervals to identify anomalies in seasonality. For the uninitiated, this means sudden traffic pattern changes, such as traffic dropping to zero during normal hours or high usage during off-hours: things that may not trigger a threshold, but that are out of season.
  • The information could also be reported in dashboards on a Splunk server for operational teams or management. Trending data over the long term also unshackles the administrator from defining hard time-of-day limits for what is normal and what is not. You can still define hard limits which can generate security events, but analysis is made much easier by seeing the whole time series.

Once identified, the vector of the outbreak needs to be cleaned before nasty malware can move in (think Zeus variant).
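
To make the Holt-Winters idea mentioned above more concrete, here is a minimal sketch of additive Holt-Winters forecasting with a smoothed per-season deviation band. It is an illustration only, not the code any trending tool runs; the parameter values and the `floor` guard (which keeps a zero deviation from flagging tiny errors) are assumptions of this example.

```python
def holt_winters_anomalies(series, period, alpha=0.5, beta=0.25, gamma=0.25,
                           delta=3.0, floor=1.0):
    """Return indexes whose value falls outside the predicted band.

    The first `period` samples seed the model and are never flagged.
    """
    if period < 1 or len(series) < period:
        raise ValueError("need at least one full period of data")
    first = series[:period]
    level = sum(first) / period            # baseline
    trend = 0.0                            # slope of the baseline
    season = [x - level for x in first]    # one offset per slot in the cycle
    deviation = [0.0] * period             # smoothed absolute error per slot
    anomalies = []
    for t in range(period, len(series)):
        i = t % period
        forecast = level + trend + season[i]
        error = series[t] - forecast
        # Flag the point if it leaves the confidence band around the forecast.
        if abs(error) > max(delta * deviation[i], floor):
            anomalies.append(t)
        # Standard additive Holt-Winters updates.
        previous_level = level
        level = alpha * (series[t] - season[i]) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        season[i] = gamma * (series[t] - level) + (1 - gamma) * season[i]
        deviation[i] = gamma * abs(error) + (1 - gamma) * deviation[i]
    return anomalies


if __name__ == "__main__":
    hits = [10, 20, 30, 20] * 5
    hits[13] = 100  # a burst where the cycle normally shows 20
    print(holt_winters_anomalies(hits, period=4))
```

Note that a large spike also drags the baseline, so the points right after it may be flagged as well; that is a limitation of this simple sketch.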

    Cricket grapher.cgi development wiki

    Friday, August 31st, 2007

    Cricket developers may be interested to know there is a wiki to keep track of suggestions and ideas on how to bring grapher.cgi into its next incarnation.


    Removing spikes from RRD databases

    Friday, August 31st, 2007

    RRDs are fixed-size databases for storing time series data. They collect information given to them and normalize it to permit trending over long periods of time.

    Spurious data may inadvertently make its way into a database. Treating this data is possible using the following means:

    • Set the rrd-min and/or rrd-max variable(s) for each datasource when creating new RRD databases
    • Use rrdtool dump to export the RRD database to XML format, edit out the spurious values and import the data back into the RRD database
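    • Use rrdtool tune to apply rrd-min and/or rrd-max value(s) to an existing RRD database. From then on, incoming values outside the defined minimum or maximum bounds are stored as NaN. Values already stored are not rewritten by tune alone; to scrub them, dump the database and restore it with range checking enabled.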
    rrdtool tune <file> --maximum <ds>:<value>
    • Use the Perl script removespikes.pl. With the limit set to 1, it removes all spikes within 1% of the datapoints in the RRD file. If 1% does not fix them, raise the percentage until all the spikes are removed. This may eat up some valid values in the process, so use with caution!
    perl removespikes.pl -l 1 fastrouter_ethernet0_1.rrd
    • Use rrd_editor, a cross platform win32 or perl/tk tool to seek and remove spikes in an RRD. I have not used the tool, but according to comments it works as advertised. It also lets you easily add or remove RRAs and datasources from an RRD, which is a golden feature for many of us.
    • Use killspike2 an RRD spike removal script distributed as part of the Cricket network management system. I have not used the script, but it is known to work.
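
The dump, edit and restore approach from the list above can be partly automated. The sketch below works on the XML text produced by rrdtool dump and replaces every stored value above a chosen maximum with NaN. It assumes stored values appear as `<v>` elements inside rows, applies the same limit to every data source, and leaves the dump and restore steps to the rrdtool command line; the function name is made up for this example.

```python
import re

# Stored values in the dump are assumed to look like <v>1.2345e+03</v>.
V_TAG = re.compile(r"<v>\s*([^<\s]+)\s*</v>")


def clip_spikes(xml_text, maximum):
    """Replace every stored value greater than maximum with NaN."""
    def replace(match):
        try:
            value = float(match.group(1))
        except ValueError:
            return match.group(0)  # leave anything unparsable untouched
        # NaN compares False here, so existing unknowns pass through as-is.
        if value > maximum:
            return "<v>NaN</v>"
        return match.group(0)
    return V_TAG.sub(replace, xml_text)


if __name__ == "__main__":
    row = "<row><v>1.0e+01</v><v>9.9e+09</v></row>"
    print(clip_spikes(row, 1000.0))
```

Try it on a copy of the RRD first; as with the other methods, a limit set too low will eat valid values.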

    As with any solution, automation and prevention are the keys to a fluid system.

    genDevConfig will automatically set rrd-min and rrd-max values for all config-tree targets it creates for Cricket.

    Download Cricket configuration generators

    Wednesday, August 29th, 2007

    Download the latest release candidate version of genDevConfig.

    genDevConfig 2.0 RC2

    Or, download the latest stable version of genRtrConfig.

    genRtrConfig 1.5.50

    To learn more about the Cricket configuration generator for SNMP managed devices, please consult the genDevConfig reference manual.

    To learn more about high performance trending of time series using Cricket, please visit cricket.sourceforge.net.

    Also of interest, an older configuration generator project, CHIRP, has support for some equipment that is not currently fully supported in genDevConfig (Fore ASX, NetScaler, HP AdvanceStack, Riverstone, Foundry BigIron). The perl modules for those device classes included in the CHIRP tool can be easily converted to genDevConfig modules. See the genDevConfig reference manual linked above.