Comments (21)
Hi @bdsmith48,
I just tried this and it worked just fine here - only thing is that print doesn't make it as nice to read as pprint but that's just a minor thing. A couple of questions to see if I can help you further.
- What version of Python are you using?
- What OS are you running this on?
I've found that depending on the host (RAM, CPU etc) and size of logfile it can take some time to read in the file. If this is your own file - what size is the file/how many lines does it contain?
Have you tried any of the other files that are bundled with Bat in the data subdirectory?
Cheers, Mike
from zat.
I've tried it on Windows with Python 2.7.6 and on a Mac, but I can't currently check which version I was using on the Mac.
from zat.
I don't think I have 2.7.6 installed anywhere at the moment, so I can't replicate this right now.
If you just set it to read in the log, does it ever finish - if you run it in the interactive Python shell for example?
Have you tried other logs than the ssh.log?
from zat.
@bdsmith48 @swedishmike My guess here is that ssh.log is an empty file or something. @bdsmith48 please try this with the files included in this repository under data/
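One quick way to rule out the empty-file guess, and to answer the earlier question about file size and line count, is sketched below. It uses only the standard library; `describe_log` is a hypothetical helper name, not part of Bat, and it assumes the log is in Bro's default tab-separated format, where metadata lines start with `#`.

```python
import os


def describe_log(path):
    """Return (size_in_bytes, data_line_count) for a tab-separated Bro log."""
    size = os.path.getsize(path)
    data_lines = 0
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Lines starting with '#' are Bro metadata, not log records
            if line.strip() and not line.startswith("#"):
                data_lines += 1
    return size, data_lines
```

A size of 0, or a data line count of 0, would suggest there is nothing for the reader to load.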
from zat.
Is there a reason why this wouldn't work with other Bro log files? I know my files aren't empty and they aren't extremely long. It turns out that it did work for your log files, though.
from zat.
I've used it with log files from quite a number of different installations - as well as some publicly available ones where I don't even know which version of Bro generated them - so it should work with yours too.
Three quick questions...
- What version of Bro are you running?
- Have you got any custom fields added to your logs?
- Depending on the sensitivity of your logs - would you be able to share them? Possibly changing IP addresses to make them less specific?
Cheers, Mike
P.S
Just another, really silly, question - you are running Bro with the default, tab separated, logs and not Json logs?
from zat.
@swedishmike good point about the json logs. @bdsmith48 if you have json logs they are currently not supported but on our todo list. See #40
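To tell the two formats apart without opening the file by hand, a rough check like the following may help. It is only a sketch: `guess_log_format` is a hypothetical helper, not a Bat function, and it assumes a tab-separated log begins with a `#separator` line while a JSON log has one JSON object per line.

```python
import json


def guess_log_format(path):
    """Guess 'json', 'tsv', or 'unknown' from the first non-empty line."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue  # skip blank lines before the first real line
            if stripped.startswith("#separator"):
                return "tsv"
            if stripped.startswith("{"):
                try:
                    json.loads(stripped)
                    return "json"
                except ValueError:
                    return "unknown"
            return "unknown"
    return "unknown"
```

Note that a tab-separated file whose header lines were stripped comes back as "unknown" rather than "tsv".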
from zat.
@bdsmith48 Are you still having issues with this or did you get it figured out?
from zat.
I emailed the same error to [email protected], but the reply said there is no such address.
I saved a 2k-row file as "bro". Then I tried to load it into a data frame with the command below. My problem is that nothing happens for 1.5 hours and the CPU is always at 100%. What can cause this? Is there a mismatch between Bro versions? What version of Bro is required for Bat? I am using Python 3.6, by the way.
bro_df = LogToDataFrame('/home/seckindinc/Desktop/Projects/Bro/bro')
from zat.
@seckindinc When you say that you 'saved it as bro' what do you mean?
Would it be possible for you to share this file so that I or @brifordwylie could give it a go in our environments?
I run Python 3.6 as well and have used logfiles from Bro 2.5.x without any problems.
from zat.
I meant that I saved a small portion of a raw log file under the name "bro". I can't share the log with you because of confidentiality. Can you share your Bro version or parser?
from zat.
Did you leave in the headers etc in the file?
Are your logs in tab separated format or JSON?
Which log file are you trying to parse at the moment? Conn, ssl, http etc?
Also, can you parse the example files that come with Bat?
I'm running Bro v 2.5.1 and 2.5.2 on various machines at the moment.
from zat.
There is no header in the file.
It is tab separated.
I have multiple types of Bro logs. I am assuming that Bro automatically parses these?
Didn't try yet. I will soon.
from zat.
I have multiple types of Bro logs. I am assuming that Bat* automatically parses these?
from zat.
I have tried your examples. I think I need to check my log files for format issues.
from zat.
I think I might have replicated your issue. I removed the headers from one of the test files and now it doesn't load properly.
I'm in the middle of some Christmas celebrations here so can't fully verify in the code right now, but my guess is that the headers are used to verify what file it is that's being opened and what fields it contains. I'm sure @brifordwylie can confirm whether or not this is true once he sees these messages and has a minute to spare over the holidays.
If you want to test you can take the headers from your original log file and add them to your exported one.
Just to confirm - what I call headers are the following lines, this example is from the dns.log file:
#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path dns
#open 2014-04-03-10-08-27
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto trans_id query qclass qclass_name qtype qtype_name rcode rcode_name AA TC RD RA Z answers TTLs rejected
#types time string addr port addr port enum count string count string count string count string bool bool bool bool count vector[string] vector[interval] bool
Please let me know if that makes any difference.
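A small check for whether an exported file still carries these header lines could look like the sketch below. `missing_header_directives` is a hypothetical helper, and treating `#fields` and `#types` as the minimum the reader needs is an assumption based on the guess above, not something confirmed from the Bat source.

```python
# Assumed minimum set of header lines; adjust if the reader needs more.
REQUIRED_DIRECTIVES = ("#fields", "#types")


def missing_header_directives(path, required=REQUIRED_DIRECTIVES):
    """Return the required '#' directives that are absent from the file's header."""
    found = set()
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # the header block ends at the first data row
            # The directive name is the first whitespace-delimited token
            found.add(line.split(None, 1)[0])
    return sorted(name for name in required if name not in found)
```

An empty list means the assumed directives are all present; anything else names the lines to copy over from the original log.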
from zat.
When I removed every detail except the data, it didn't work for me either. I think Bat requires the column names and types.
from zat.
So if you leave in the lines starting with # at the top of the file, it works?
As I said in the previous comment - I think this is what's used to ascertain what file it is as well as get the field names and so on.
from zat.
Thank you so much for your help. This package works great if we give detailed info about the log.
from zat.
@seckindinc No worries at all - I'm glad I could help you.
@bdsmith48 - Just to check - could this be the solution to your issue too?
from zat.
@swedishmike @seckindinc Yes, the reader reads in the Bro headers. All Bro versions should be putting out headers on the files. If you cut/paste some of the rows into another file, you'll need to include the headers as well.
I'm going to close this ticket. If @bdsmith48 wants to reopen it, we will.
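For anyone who wants a small sample of a large log, the cut/paste step can be done in a way that keeps the headers. The following is a minimal sketch using only the standard library; `copy_log_sample` is a hypothetical helper, not part of the package, and it assumes every metadata line starts with `#`.

```python
def copy_log_sample(src_path, dst_path, max_rows):
    """Copy all '#' metadata lines plus the first max_rows data rows."""
    rows_written = 0
    with open(src_path, "r", encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.startswith("#"):
                # Keep every metadata line, including a trailing #close
                dst.write(line)
            elif rows_written < max_rows:
                dst.write(line)
                rows_written += 1
    return rows_written
```

The whole source file is scanned so that a `#close` line at the end is carried over too; for very large logs you may prefer to stop once `max_rows` is reached.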
from zat.