Friday, April 1, 2016

Penetration Testing data management and reporting tool - MagicTree

MagicTree is a penetration tester productivity tool. It is designed to allow easy and straightforward data consolidation, querying, external command execution and report generation. In case you wonder, "Tree" is because all the data is stored in a tree structure, and "Magic" is because it is designed to magically do the most cumbersome and boring part of penetration testing - data management and reporting.

MagicTree stores data in a tree structure. This is a natural way of representing the information gathered during a network test: a host has ports, which have services, applications, vulnerabilities, etc. The tree-like structure is also flexible in terms of adding new information without disturbing the existing data structure: if at some point you decide that you need the MAC address of a host, you just add another child node to the host node.
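To make the idea concrete, here is a minimal sketch in plain Python (nested dictionaries, not MagicTree's actual storage format) of how such a tree might look, and how a new node is added without touching existing data:

# Hypothetical sketch of a pentest data tree: host -> port -> service -> vulnerability.
host = {
    "host": "10.0.0.5",
    "children": [
        {"port": 80, "children": [
            {"service": "http", "children": [
                {"vulnerability": "outdated web server version"},
            ]},
        ]},
    ],
}

# Adding new information later does not disturb the existing structure:
# a MAC address simply becomes another child node of the host.
host["children"].append({"mac": "00:11:22:33:44:55"})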

Once you have all the data you want, you can use it to produce a report. Reports are generated from templates. A template is simply an OpenOffice or Microsoft Word file that contains all the static data and formatting you want (your company logos, headers, footers, etc.) and placeholders for the data coming from MagicTree.

MagicTree Installation:
First, Ubuntu does not ship with a Java JRE by default. Unless you have a reason to do otherwise, install the OpenJDK JRE. The following command can be used:
sudo apt-get install openjdk-6-jre
No installation is required for MagicTree. The application is distributed as a single JAR file which has to be executed with the JRE. Just save the file to your desktop and double-click it, or run it from the command line:
java -jar MagicTree.jar


On startup, it will automatically create a .magictree directory under the user's home directory and unpack the necessary files there. The files are overwritten if the JAR contains more recently modified versions, so files changed in the most recent build replace the old ones (assuming your local clock is correct).

Intelligent Data Management Framework for Microsoft Dynamics AX

The Intelligent Data Management Framework for Microsoft Dynamics® AX helps administrators optimize the Dynamics AX database layout by intelligently monitoring index usage, index layout, fragmentation, and query patterns through indices. The framework allows the database size to be reduced by purging transaction records from a set of related entities, while maintaining the consistency and integrity of production data. The Intelligent Data Management Framework gives customers and partners the ability to identify and discover related entities based on Microsoft Dynamics AX metadata and to determine the purging criteria for entities and transactions. It also analyzes the production database to determine current usage patterns and assesses the health of the Microsoft Dynamics AX application.

Data Management and Analysis

Time-Series Data
An important element of IWRM (integrated water resources management) modeling is the management and processing of data. Data is the fundamental key to a successful representation of a system; without proper data representation, the model is lost. I decided to do a quick search to see what people are using today. Doing simple Google searches on the various tools, I found HEC-DSSVue and Hydstra to be the most popular (assuming Internet search returns are a measure of popularity). Here are the search results:

HEC-DSSVue => 320 unique results
Hydstra => 390 unique results
Aquarius => 100 unique results

In looking at scholarly documents online, I was able to compare the three tools shown above by counting the kinds of models each was used for. HEC-DSSVue seems to have been used most often for water resources and flood modeling, Aquarius for water resources and operations models, and Hydstra for water quality and runoff models. Table 1 summarizes the results of this comparison.

As part of my research, I posted a question on some LinkedIn groups to see what people had to say. You can see the comments by following the links below to the posts I made in these groups. Note that you need to be a member of these groups to see the comments.

AWRA Group Comment
AWWA Group Comment
Assoc.Water and Enviro-Modeling Group Comment

sci.geo.hydrology Forum Comment

These are the tools I am at least somewhat familiar with:

HEC-DSSVue
This tool is a powerful data management system with over 60 math functions that can be performed on the data and many utility functions for dealing with the data, such as interpolation. The GUI provides a robust data viewing capability that can be customized to suit your needs. The obvious benefit of this tool is that it is extensible and free.

This tool is widely used in the government, academic, and consulting arenas since it has been around for a while and is free to use with decent software documentation.

1. Retrieving data
There are many ways to retrieve data for use in HEC-DSSVue. You can enter data manually, or import text files, SHEF data, CSV files, USGS data, NCDC data, CDEC data, DSSUTL format files, and even image files (i.e. gridded data). Another way is via a Java plug-in, which can be used to extend the software (usually by adding a menu button to the main screen of the software). The software also comes with an MS Excel add-in for retrieving and storing data directly from the spreadsheet file. An available plug-in also allows users to retrieve SNOTEL data from the NRCS online database. Data files with sizes up to 8 GB can be stored in HEC-DSSVue.
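HEC-DSSVue also bundles a Jython scripting interface, which is another way to pull data out of a DSS file programmatically. The following is a minimal sketch along the lines of the examples in the scripting chapter of the user manual; the file name and DSS pathname are placeholders, and the class and method names are as I recall them, so check them against your version's documentation:

# Runs inside HEC-DSSVue's bundled Jython interpreter, not standalone Python.
from hec.heclib.dss import HecDss

# Open a DSS file and retrieve one time-series record by its DSS pathname
# (the /A/B/C/D/E/F parts); both names here are illustrative only.
dssFile = HecDss.open("C:/data/example.dss")
flow = dssFile.get("/BASIN/GAGE/FLOW/01JAN2000/1DAY/OBS/")

# The returned container exposes the times and values as parallel arrays.
for t, v in zip(flow.times, flow.values):
    print t, v    # Jython 2.x print statement

dssFile.done()    # release the file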


2. Viewing data
Data can be viewed in tabular or graphical format. There is also an option to view data in groups, as shown in the examples on the HEC website.

Plots can be customized in many ways.
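Plots can also be produced from the bundled Jython scripting interface rather than the GUI. A minimal sketch (again with placeholder names, and with the API as I recall it from the scripting documentation):

# Runs inside HEC-DSSVue's Jython interpreter.
from hec.heclib.dss import HecDss
from hec.script import Plot

dssFile = HecDss.open("C:/data/example.dss")                  # placeholder file name
flow = dssFile.get("/BASIN/GAGE/FLOW/01JAN2000/1DAY/OBS/")    # placeholder pathname

plot = Plot.newPlot()    # create an empty plot window
plot.addData(flow)       # add the retrieved time series
plot.showPlot()          # display it

dssFile.done()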


3. Data types
The data can be represented in 4 different ways, which controls how interpolation is performed. These include:


INST-CUM (instantaneous cumulative, e.g. a precipitation mass curve)
INST-VAL (instantaneous value, e.g. river stages)
PER-AVER (period average, e.g. monthly flow values)
PER-CUM (period cumulative, e.g. incremental precipitation)


Interpolation between points in a dataset varies depending on the data type chosen, as demonstrated in the HEC-DSS user manual with an illustration for each type.
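As a rough, tool-independent illustration of the distinction (plain Python, not HEC-DSSVue code): instantaneous types lend themselves to linear interpolation between points, while period types carry one value per interval, so the value is held across the period rather than interpolated linearly.

# Sketch of how the data type changes interpolation between two samples
# (t0, v0) and (t1, v1); illustrative only, not HEC-DSSVue's implementation.

def interp_inst_val(t, t0, v0, t1, v1):
    # INST-VAL (e.g. river stage): linear interpolation between the samples.
    frac = (t - t0) / float(t1 - t0)
    return v0 + frac * (v1 - v0)

def interp_per_aver(t, t0, v0, t1, v1):
    # PER-AVER (e.g. monthly flow): the value stamped at t1 applies to the
    # whole period (t0, t1], so it is held constant rather than interpolated.
    return v1 if t0 < t <= t1 else v0

# Example: samples at t=0 and t=10 with values 2.0 and 4.0
print(interp_inst_val(5, 0, 2.0, 10, 4.0))   # 3.0, halfway between the points
print(interp_per_aver(5, 0, 2.0, 10, 4.0))   # 4.0, the average for that period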


4. Data analysis
There really is no limit to the math functions you can perform on datasets in HEC-DSSVue, but some functions may require a script to be written, which can be overwhelming for the novice user. Basic functions, such as accumulation, absolute value, unit conversion, and subtracting one dataset from another, can be executed graphically.

A nice feature of HEC-DSSVue is the ability to perform hydrologic functions on the datasets. These functions include Muskingum Routing, Straddle Stagger Routing, Modified Puls Routing, Rating Table, Reverse Rating Table, Two Variable Rating Table, Decaying Basin Wetness, Shift Adjustment, Period Constants, Multiple Linear Regression, Apply Multiple Linear Regression, Conic Interpolation, Polynomial, Polynomial with Integral, and Flow Accumulator Gage Processor.
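To give a feel for what one of these functions does, Muskingum routing translates an inflow hydrograph into an outflow hydrograph using a storage constant K and a weighting factor X. Below is a minimal, tool-independent sketch of the standard Muskingum equations (not HEC's implementation; the example numbers are made up):

# Standard Muskingum channel routing, plain Python.
# K: storage time constant, X: weighting factor (0 to 0.5), dt: routing interval.
# K and dt must be in the same time units (e.g. hours).

def muskingum_route(inflow, K, X, dt, initial_outflow=None):
    denom = 2.0 * K * (1.0 - X) + dt
    c0 = (dt - 2.0 * K * X) / denom
    c1 = (dt + 2.0 * K * X) / denom
    c2 = (2.0 * K * (1.0 - X) - dt) / denom   # c0 + c1 + c2 = 1

    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for i in range(1, len(inflow)):
        outflow.append(c0 * inflow[i] + c1 * inflow[i - 1] + c2 * outflow[-1])
    return outflow

# Example: route a small flood wave with K = 12 h, X = 0.2, dt = 6 h.
print(muskingum_route([10, 50, 120, 80, 40, 15, 10], K=12.0, X=0.2, dt=6.0))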

The following statistical functions can be invoked:  Basic (statistics), Linear Regression, Cyclic Analysis, Duration Analysis, and Frequency Plot.
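As an example of what the duration analysis produces, a flow-duration curve ranks the observed values and assigns each a percentage of time it is equalled or exceeded. A minimal, tool-independent sketch (using the common Weibull plotting position, with made-up data):

# Simple duration (exceedance) analysis, plain Python; illustrative only.

def duration_curve(values):
    # Return (percent of time exceeded, value) pairs, largest value first.
    ranked = sorted(values, reverse=True)
    n = len(ranked)
    # Weibull plotting position: P = m / (n + 1), where m is the rank.
    return [(100.0 * (m + 1) / (n + 1), v) for m, v in enumerate(ranked)]

daily_flows = [12, 30, 45, 8, 22, 60, 15, 9, 27, 40]   # made-up example data
for pct, flow in duration_curve(daily_flows):
    print("%5.1f%% of the time, flow equals or exceeds %s" % (pct, flow))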

Aquarius

Compared to HEC-DSS, Aquarius is said to be more visual and user-friendly, which is mostly attributed to the object-oriented framework in which the user works with the data. This is worth considering when comparing the two, since HEC-DSS is often less able to help the user visualize the data and the functions applied to them. Another major difference I can see is that Aquarius provides some functionality to streamline the acquisition of data from the field. I haven't seen enough of the tool to say whether any computational aspect presents a great benefit over HEC-DSSVue, but they seem to be comparable in this regard.

1. Retrieving data
There are many ways to retrieve data for use in Aquarius. One of the obvious benefits of Aquarius (compared to HEC-DSS) is that it is very easy to acquire data from data loggers in the field. You can enter data manually via ad-hoc entry of datasets such as portable field meter data, grab samples, or stage vs. discharge point pair data. You can also import the most common file formats, such as ASCII or text-based formats (e.g. comma-separated values (.csv) files), Aquarius Optimized Packages (.aop), AquariusML (.xml), and HYDAT text files. It is also very easy to join chunks of data to the end of existing datasets, but this is trivial in HEC-DSS as well.

Another important aspect of Aquarius is data acquisition from outside sources, including Distributed Control Systems (DCS), Supervisory Control and Data Acquisition (SCADA) systems, Programmable Logic Controllers (PLC), batch execution systems, Lab Information Management Systems (LIMS), relational databases, and XML files.

2. Viewing data
Data can be viewed in tabular or graphical format. There is also an option to view data in groups, as shown in the examples on the Aquarius website.

The main screen also provides a view that allows data and functions to be dealt with in an object-oriented fashion.



3. Data types
The data can be represented in different ways, which controls how interpolation and accumulation are performed. This question has been asked of the company and I await their reply.

4. Data analysis
Functions similar to those described under HEC-DSS can also be performed in Aquarius. Functions and utilities also include data corrections, discharge (rating) curve development, and statistics. It can help with paired data and with performing lagged regressions, linear regression, autoregressive models, and artificial neural networks. A special note needs to be made about why someone might choose Aquarius over the free HEC-DSSVue. After spending some time with support staff on the phone, I learned that even the USACE (which develops HEC-DSSVue) chooses to use Aquarius because of its robust data analysis features, such as producing rating curves for stream gages and for modeling purposes. Apparently, they currently have about 200 customers.
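For context, a stream-gage rating curve typically takes the power-law form Q = C·(h − h0)^b, relating stage h to discharge Q. Here is a minimal, tool-independent sketch of fitting one by linear regression in log space (not Aquarius code; the point pairs are made up and the stage of zero flow h0 is assumed known):

# Fit Q = C * (h - h0)^b by ordinary least squares on log-transformed data.
import math

def fit_rating_curve(stages, flows, h0=0.0):
    xs = [math.log(h - h0) for h in stages]
    ys = [math.log(q) for q in flows]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    C = math.exp(mean_y - b * mean_x)
    return C, b

# Made-up stage vs. discharge point pairs (the kind entered manually in Aquarius).
stages = [0.5, 1.0, 1.5, 2.0, 3.0]
flows = [2.1, 8.0, 17.5, 31.0, 68.0]
C, b = fit_rating_curve(stages, flows)
print("Q = %.2f * h^%.2f" % (C, b))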

Hydstra
Hydstra is similar to Aquarius with a couple of differences: integration with GIS and no object-oriented palette for building the data model. Hydstra is developed by Kisters. Data can be viewed graphically and in tabular form. It appears that Hydstra is much more popular than Aquarius.

It appears that Hydstra has been used more for real-time data management than for dealing with the larger, historic records that might be incorporated into IWRM models.

Interestingly, Hydstra datasets can be published online.

I sent the Kisters team a query about how Hydstra compares to HEC-DSSVue and got this reply:

Kisters have well over 500 water agencies worldwide using our water management products. These range from national systems such as the UK Environment Agency, South African Department of Water Affairs, and Australian Bureau of Meteorology through to large state government water agencies and private companies such as Pacific Gas and Electric Company [PG&E California].

I would need to do a full cross-check of the functionality offered by HEC-DSSVue and the various Kisters products to understand the exact differences. At face value, however, the Kisters products have far more functionality, as our products offer extensive data management, reporting, and analysis tools. Simple functions such as "batching" and producing PDF outputs are basic functionality within Kisters.

Kisters offer a "richer" suite of tools to undertake plots, data tabulation, editing, and manipulation.