Still can’t query big datasets #800
Description
I’m on InfluxDB version 1.7.9, and PythonClient 5.2.3.
I have a database that weighs around 28 GB, and I'm trying to query it from a Mac (macOS 10.12.6, if that's relevant) with 16 GB of RAM, using the Python client on Python 3.7.
I've been fighting with this issue for a week now. At the beginning I would get this error:
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
Then I read issues #450, #523, #531, #538, and #753, implemented the changes from #753, and now when I run the query Python simply exits with:
Process finished with exit code 137 (interrupted by signal 9: SIGKILL)
I was hoping that would turn client.query into a generator that yields each line as soon as it arrives, so I can process it, free the RAM, and move on to the next line; basically streaming. result is now a generator, but that streaming behaviour still doesn't happen.
This is my Python code:
import time
from datetime import datetime
import pandas as pd
from influxdb import InfluxDBClient

client = InfluxDBClient(host='127.0.0.1', port=8086, username='x', password='x', database='x')
q = 'SELECT * FROM "x"'
result = client.query(q, chunked=True).get_points()
print("Query done " + str(datetime.utcfromtimestamp(time.time())))
df = pd.DataFrame()  # empty dataframe that collects the processed points
for msg in result:
    # process each point, append it to the dataframe, then save the dataframe to a csv
    ...
The 'x' values are redacted.
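For reference, this is roughly the streaming behaviour I was hoping for: consume one chunk at a time and append it to the CSV instead of holding everything in RAM. It's only a minimal sketch; it assumes the patched client from #753 yields one ResultSet per chunk when chunked=True, and the chunk_size value and the 'out.csv' path are just placeholders I made up.

import pandas as pd
from influxdb import InfluxDBClient

client = InfluxDBClient(host='127.0.0.1', port=8086, username='x', password='x', database='x')

first_chunk = True
for chunk in client.query('SELECT * FROM "x"', chunked=True, chunk_size=10000):
    # each chunk is a ResultSet; get_points() iterates over its rows as dicts
    df = pd.DataFrame(list(chunk.get_points()))
    # append the chunk to the CSV, writing the header only once
    df.to_csv('out.csv', mode='a', header=first_chunk, index=False)
    first_chunk = False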
Thanks in advance