Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Cortexutils extractor ip detection #199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: develop
Choose a base branch
Loading
from
Open

Cortexutils extractor ip detection #199

wants to merge 2 commits into from

Conversation

srilumpa
Copy link
Contributor

Fix proposal for #198.

  • 1.0.0.127.localhost.localdomain. => does not match
  • 192.168.0.10 => does match
  • 192.168.0.1/24 => does match
  • 10.0.0.0/8 => does match
  • 10.0.0.0/ => does not match

@3c7 3c7 added category:bug Issue is related to a bug status:pr-submitted labels Feb 27, 2018
@3c7 3c7 self-requested a review February 28, 2018 08:28
@3c7
Copy link
Contributor

3c7 commented Feb 28, 2018

As the extractor should extract observables that aren't available as "a single line" also, line start and end markers (^ and $) are not really applicable here and with that something like 1.0.0.127.localhost.localdomain will be extracted as 1.0.0.127.

I need to think about how (if?) it's possible to distinguish between an IP in CIDR notation which is in-line and the mentioned "domain case".

@rolinh
Copy link

rolinh commented Feb 28, 2018

Even with the fix, 999.888.777.666 would match as a valid IPv4 address when clearly, it is not.

Why do you have to use regular expressions to match IP addresses? Why not use something like ip_address from the standard library?

def is_ip(s):
    try:
        ip_address(s)
        return True
    except ValueError:
        return False

In the end, there is not different types for IPv4 and IPv6 in TheHive, only ip.

Also, maybe it would be worth considering adding a network type to TheHive instead of having a range in CIDR notation considered as a valid IPv4 (and you guessed it, ip_network could be used to validate a network in CIDR notation).

@3c7
Copy link
Contributor

3c7 commented Feb 28, 2018

Why do you have to use regular expressions to match IP addresses?

Because the extractor should provide an easy way to retrieve observables from reports even if they are in-line and not explicit given. Analyzers do not have to use the extractor and can implement an own artifacts method.

Why not use something like ip_address from the standard library?

That would indeed be possible - after finding a possible IP address using regex.

@rolinh
Copy link

rolinh commented Feb 28, 2018

Why do you have to use regular expressions to match IP addresses?

Because the extractor should provide an easy way to retrieve observables from reports even if they are in-line and not explicit given. Analyzers do not have to use the extractor and can implement an own artifacts method.

I'm sorry, I don't understand why this makes using regular expressions a hard requirement.

Why not use something like ip_address from the standard library?

That would indeed be possible - after finding a possible IP address using regex.

Again, why? ip_address can take a string as input so you could feed it with the value to check just as you feed regexp.match. Of course, this would imply a minor refactoring(*) of the Extractor class but in the end you would obtain more reliable results.

(*) The changes would be minimal and only internal to the class.

EDIT: to be clear, I volunteer to implement such changes in a PR if you deem them worth.

@3c7
Copy link
Contributor

3c7 commented Feb 28, 2018

Again, why? ip_address can take a string as input so you could feed it with the value to check just as you feed regexp.match.

Yes, it takes a string, but cannot find addresses in a block of text. Again, the extractor should provide the functionality to "automagically" find observables/IoCs in strings which are not the observable itself, but a "wall of text".

Of course is recognizing single ip strings through regex is not the best way to achieve that. But you cannot guarantee that.

As this affects only analyzers run through MISP, it has a low prio for me until cortex 2 is released and the documentation is polished up as you're able to easily delete inappropriate IoCs (what has to be done anyway, as not every returned value is appropriate) in the overview after running the analyzer.

Copy link
Contributor

@3c7 3c7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently not applicable this way as observables cannot be found in free-text. Without the start and end markers it changes basically nothing. Would like to postpone this until Cortex 2 is released and documentation work is done. But thanks for the contribution.

@kx499-zz
Copy link
Contributor

I'm running into an issue where the IP's extracted (ipv4) include version strings and are not valid. I fixed this in code that calls cortex (external App) and basically filters the artifacts through the a call to IP Address. You could post process after the regex results are returned with ipaddress(val).is_global to limit out some of the noise.

@To-om To-om force-pushed the develop branch 3 times, most recently from fb8f5aa to 23be632 Compare July 29, 2020 15:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:bug Issue is related to a bug scope:question
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.