LIVEcommunity - MineMeld to Extract Indicators From generic API - LIVEcommunity - 218757
MineMeld to Extract Indicators From generic API
Xhoms 07-07-2018 01:57 AM
Introduction
Although MineMeld was conceived as a threat sharing platform, reality has shown many
users are taking advantage of its open and flexible engine to extract dynamic data (not
threat indicators) from generic APIs.
* The highly successful case of extracting O365 dynamic data (IPs, domains and
URLs) from its public-facing API endpoint.
* Many users relying on MineMeld to track the public IP space of cloud and CDN
providers like AWS, Azure, CloudFlare and Fastly as a much more robust and scalable
alternative to mapping them with FQDN objects.
* Or even people using MineMeld to extract the list of URLs of videos published in
specific YouTube playlists or channels via the corresponding Google API.
All these are examples of MineMeld being used to extract dynamic data from public APIs.
Depending on the source, a new class (Python code) may be needed to implement the
client-side logic of the API we want to mine. But, in many cases, the already available
ready-to-consume "generic classes" can be used instead. This way the user can
"mine" a generic API without having to contribute code to the GitHub
project.
The "generic classes"
There are, basically, three "generic classes" that can be reused in many applications:
+ The HTTPFT class: Create a prototype for this class when you need to extract
dynamic data from content delivered in HTML or plain text (text/plain, text/html).
+ The SimpleJSON class (I love this one!): Do you need to extract dynamic data from
an API that delivers the response as a JSON document? You're all set with a
prototype of this class!
+ The CSVFT class: Some services still use variants of CSV (delimiter-based multi-field
lines) to deliver their content.
The following rule of thumb will tell you whether the API you want to extract
dynamic data from can be "mined" using MineMeld by providing just a prototype for one
of these classes (without writing a single line of code!):
1. The transport must be HTTP/HTTPS
2. None or basic authentication (user + password)
3. Single transaction (one call retrieves the whole indicator list - no pagination)
4. Indicators are provided in plain, html, csv or json format.
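To make the first three conditions concrete, here is a minimal Python sketch of the kind of request a generic class would have to issue: a single GET over HTTP/HTTPS, with no authentication or with basic authentication. The helper names `build_request` and `fetch_all` and the credentials are hypothetical and are not part of MineMeld.

```python
import base64
import urllib.request


def build_request(url, username=None, password=None):
    """Hypothetical helper: one GET request, optionally with basic auth."""
    request = urllib.request.Request(url, method="GET")
    if username is not None and password is not None:
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_all(request, timeout=10):
    """Single transaction: the whole list is expected in one response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; fetch_all() would.
req = build_request("https://test.minemeld.com/json", "user", "secret")
```

If an API needs tokens, signed requests or several paginated calls, it falls outside what this sketch (and the rule of thumb) covers.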
The following sections in this article will teach you how to use these generic classes to
mine an example API that provides the real-time temperature of four MineMeld-related
cities in the world:
Format    API URL
CSV       https://test.minemeld.com/csv
HTML      https://test.minemeld.com/html
JSON      https://test.minemeld.com/json
Mining a CSV API
We will start with CSV because it is, probably, the easiest of the generic
classes. The theory of operations is:
+ The CSVFT class will perform an HTTPS API call with no (or basic) authentication.
The expected result is a table-like document where every line contains an
indicator plus additional attributes separated by a known delimiter.
+ Before the CSV parser kicks in, a regex pattern will be used to discard lines that
should not be processed (e.g. comments).
+ The prototype will provide configuration elements to the CSV parser to perform
the correct field extraction from each line.
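The filter-then-parse part of that flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CSVFT source code: the function name `mine_csv`, the sample body and the `^#` pattern are made up for the example, and the HTTP call is left out.

```python
import csv
import re


def mine_csv(body, ignore_regex, fieldnames, delimiter=","):
    """Hypothetical helper: drop ignored lines, then parse the rest as CSV."""
    ignore = re.compile(ignore_regex)
    # Discard every line the ignore pattern matches (comments, for instance).
    kept = [line for line in body.splitlines() if not ignore.match(line)]
    # Hand the surviving lines to the CSV parser with the configured names.
    reader = csv.DictReader(kept, fieldnames=fieldnames, delimiter=delimiter)
    return list(reader)


# Made-up response body, used only for illustration.
SAMPLE = "# comment line\nhttps://example.com/a,red\nhttps://example.com/b,blue\n"
rows = mine_csv(SAMPLE, r"^#", ["indicator", "color"])
```

Each returned row is a dictionary keyed by field name, so the indicator and its attributes can be read by name rather than by position.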
First of all, let's call the demo CSV API and analyze the results:
Request ->
GET /csv HTTP/1.1
Host: test.minemeld.com
Response Headers <-
HTTP/2.0 200 OK
content-type: text/csv
content-disposition: attachment; filename="minemeldtest.csv"
content-length: 432
Response Body <-
# Real-Time temperature of MineMeld-related cities in the world.
url,country,region,city,temperature
https://ajuntament.barcelona.cat/turisme/en/,ES,Catalunya,Barcelona,12.24
http://m.turismo.comune.parma.it/en,IT,Emilia-Romagna,Parma,16.03
http://santaclaraca.gov/visitors,US,California,Santa Clara,8.98
+ The API returns text/csv content and suggests that we store the results as an
attachment with the name minemeldtest.csv.
+ Regarding the content, it looks like 4 data records are provided with up to 5 fields
each: url, country, region, city and temperature. There are, as well, two lines that do
not provide any value and that should be discarded (the one with the comment and
the one with the field headers).
+ And, as for the CSV parsing task, it looks like the fields are clearly delimited by the
comma character.
We're now ready to configure our prototype to mine this API with the CSVFT class.
Step 1: Create a new prototype using any CSVFT-based one as starting
point.
We'll use the prototype named "sslabusech.ipblacklist" as our starting point. Just
navigate to the config panel, click on the lower right icon (the one with the three lines) to
expose the prototype library and click on the sslabusech one.
Clicking on the sslabusech prototype will reveal its configuration, as shown in the following
picture.
The most important value in the prototype is the class it applies to. In this case, the
CSVFT one we want to leverage. Our mission is to create a new prototype and to change
its configuration to accomplish our goal of mining the demo CSV API. The following is the
set of changes we will introduce:
* Name, Description and Tags (to make it searchable in the prototype library)
* Inside the CONFIG section:
  + We will replace url with https://test.minemeld.com/csv
  + We'll change the indicator type to URL and set the confidence level to 100
  + Provide our own set of fieldnames
  + Define the ignore regex pattern as "^(?!https)" (to discard all lines except the
  ones starting with "https")
  + Describe the source_name as minemeld-test
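As a rough illustration, the CONFIG changes listed above can be written as a Python dictionary and the ignore pattern checked against a few lines. The key names (`url`, `attributes`, `fieldnames`, `ignore_regex`, `source_name`) and the field list are assumptions drawn from the labels used in this article, so verify them against the prototype shown in the editor. The pattern is read here as the negative lookahead `^(?!https)`.

```python
import re

# Assumed key names and values, mirroring the list of changes above.
prototype_config = {
    "url": "https://test.minemeld.com/csv",
    "attributes": {"type": "URL", "confidence": 100},
    "fieldnames": ["indicator", "country", "region", "city", "temperature"],
    "ignore_regex": "^(?!https)",
    "source_name": "minemeld-test",
}

ignore = re.compile(prototype_config["ignore_regex"])

lines = [
    "# Real-Time temperature of MineMeld-related cities in the world.",
    "url,country,region,city,temperature",
    "https://ajuntament.barcelona.cat/turisme/en/,ES,Catalunya,Barcelona,12.24",
]
# A line is dropped when the pattern matches, i.e. when it does NOT
# start with "https".
kept = [line for line in lines if not ignore.match(line)]
```

Because the lookahead tests for the literal prefix "https", a line that starts with plain "http://" is discarded too. A looser pattern such as `^(?!http)` would keep both schemes.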
Simply click on the NEW button and modify the prototype as shown in the following
picture.
Please take a closer look at the fieldnames list and note that the first name in our prototype
list is "indicator" (in the CSV body the first field was suggested to be "url" instead).
The CSV engine inside the CSVFT class will extract all comma-separated values from each
line and use the one in the column named "indicator" as the value containing the
indicator we want to extract. Every other field will be extracted and attached to the
indicator as an additional attribute.
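The role of the "indicator" column can be shown with Python's standard csv module. This is only a sketch of the idea, not the CSVFT implementation. The field list is the one assumed for this walkthrough.

```python
import csv

# Field names as assumed for this walkthrough; the first column is renamed
# from "url" to "indicator".
FIELDNAMES = ["indicator", "country", "region", "city", "temperature"]
line = "https://ajuntament.barcelona.cat/turisme/en/,ES,Catalunya,Barcelona,12.24"

row = next(csv.DictReader([line], fieldnames=FIELDNAMES))
# The column named "indicator" carries the indicator value...
indicator = row.pop("indicator")
# ...and whatever is left travels with it as additional attributes.
attributes = dict(row)
```

Note that the parser returns every value as a string, so the temperature arrives as "12.24" rather than as a number.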
Step 2: Clone the prototype as a working node (miner) in the MineMeld
engine
Clicking on OK will store this brand new prototype in the library and the browser will
be sent to it. Just change the search field to reveal our CSV prototype and then click on it.
Now it is time to clone this prototype as a working node in the MineMeld engine. So
just click on the CLONE button, give the new miner node a name and commit the new
configuration.
Step 3: Verify the node status.
Once the engine restarts you should see a new node in your MineMeld engine with
indicators in it. Click on it, then click on its LOG button and, finally, click on any log entry
to reveal the indicator details.
As shown in the last picture, the extracted indicators are of URL type and additional
attributes like city, region, country and temperature are attached to them.
Other optional configuration parameters supported by the CSVFT class are:
* fieldnames: if it is null, the values extracted from the first parsed line
will be used as fieldnames (remember that one of the fields must be named
"indicator")
* delimiter, doublequote, escapechar, quotechar and skipinitialspace control the
CSV parser behavior as described in the Python csv module reference
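These option names match the formatting parameters of Python's standard csv module, so their effect can be tried outside MineMeld. The sketch below uses a made-up, semicolon-delimited sample and assumes that a null fieldnames setting behaves like `fieldnames=None` in `csv.DictReader`.

```python
import csv

# Made-up sample: header row, semicolon delimiter, single-quoted fields and
# a space after each delimiter.
lines = [
    "indicator; city; note",
    "https://example.com/a; 'Santa Clara'; 'mild; dry'",
]

reader = csv.DictReader(
    lines,
    fieldnames=None,        # None: take the names from the first line
    delimiter=";",
    quotechar="'",          # fields wrapped in single quotes
    skipinitialspace=True,  # ignore the space that follows each delimiter
)
rows = list(reader)
```

The quoted field keeps its embedded semicolon, which is the usual reason to set quotechar when the delimiter can also appear inside values.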
Mining a HTML API
In this section you will be provided with the steps needed to use the HTTPFT class to mine
dynamic data exposed in the response to an HTTP request (typically text/plain or
text/html). If you have not done so, please review the complete process described in the
section "Mining a CSV API" to understand concepts like "creating a new prototype",
"cloning a prototype as a working node", etc.
To build a new HTTPFT prototype we first need a base prototype that already leverages this
class. In this example we will use the prototype named dshield.block as the base.
Let's take a deeper look at the HTML API response to figure out how to generate a valid
prototype to accomplish our mission.
Request ->
GET /html HTTP/1.1
Host: test.minemeld.com
Response Headers <-
HTTP/2.0 200 OK
content-type: text/html
content-length: 1626
Response Body <-