Watch instances not getting garbage collected #985

@efroemling


Environment details

  • OS type and version: Ubuntu 24.04.1 LTS
  • Python version: Python 3.12.3 (main, Nov 6 2024, 18:32:19) [GCC 13.2.0]
  • pip version: pip 24.3.1
  • google-cloud-firestore version: 2.19.0

I'm working on a Python server app which creates/destroys a decent number of document listeners as part of its operation, and have noticed some of my objects being kept alive longer than I'd expect. I have traced this down to Firestore Watch objects hanging around holding onto callbacks for a long time after I've unsubscribed and released all references to them.

I've put together a minimal repro case that demonstrates a reluctant-to-die Watch object, plus a workaround example showing how clearing a few internal fields after the unsubscribe call seems to break the reference cycles and lets it be collected immediately.
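
For anyone who wants to dig into what is keeping a Watch alive themselves, here's a rough sketch of how the remaining referrers can be inspected with the stdlib gc module (the helper name here is made up and purely for illustration; it isn't part of the library):

import gc

import google.cloud.firestore_v1.watch


def dump_watch_referrers() -> None:
    """Print whatever is still holding references to live Watch objects."""
    # Run a full collection first so anything left alive is held by a
    # real reference rather than by pending cyclic garbage.
    gc.collect()
    watches = [
        obj
        for obj in gc.get_objects()
        if isinstance(obj, google.cloud.firestore_v1.watch.Watch)
    ]
    for watch in watches:
        print(f'Watch {watch!r} is kept alive by:')
        # Note: the local 'watches' list above will show up here too.
        for referrer in gc.get_referrers(watch):
            # Truncate output; some referrers (frames, big dicts) are huge.
            print(f'  {type(referrer).__name__}: {referrer!r}'[:200])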

Steps to reproduce

  1. Run the test_watch_cleanup() call from the code below. It takes a Client instance, creates a Watch, unsubscribes the Watch a few seconds later, and finally spins off a thread holding only a weak-ref and waits for the Watch to be garbage collected. In my case this results in the Watch living on indefinitely or at least for a long while.
  2. In the same example, flip WORKAROUND to True and run it again. In my case this results in the Watch object going down immediately once no longer in use.

Code example

import time
import functools
import weakref
import threading
from typing import Any

import google.cloud.firestore_v1

WORKAROUND = False

def test_watch_cleanup(client: google.cloud.firestore_v1.Client) -> None:
    """Test Watch object garbage collection"""

    # Create a ref to any document (doesn't need to exist).
    doc_ref = client.collection('some_collection').document('some_doc_id')

    # Start listening for changes.
    doc_watch = doc_ref.on_snapshot(on_snapshot)

    # Give it a moment.
    time.sleep(3.0)

    if WORKAROUND:
        # Need to grab these here before unsubscribe clears them.
        rpc = doc_watch._rpc
        consumer = doc_watch._consumer

    # Stop listening for changes.
    doc_watch.unsubscribe()

    if WORKAROUND:
        # From looking at which references were keeping doc_watch alive,
        # I found that clearing these fields after the unsubscribe allows
        # it to go down cleanly.
        rpc._initial_request = None
        rpc._callbacks = []
        consumer._on_response = None
        doc_watch._snapshot_callback = None

    # Lastly, kick off a thread to keep an eye on doc_watch until it dies.
    threading.Thread(
        target=functools.partial(watch_the_watcher, weakref.ref(doc_watch))
    ).start()

def on_snapshot(*args: Any) -> None:
    """Listener callback"""
    print('got snapshot')

def watch_the_watcher(doc_watch_weak_ref: weakref.ReferenceType) -> None:
    """Spin and wait for doc_watch to die."""

    starttime = time.time()
    while doc_watch_weak_ref() is not None:
        print(f'doc_watch is still alive ({time.time()-starttime:.0f}s)')
        time.sleep(5.0)
    print('doc_watch IS DEAD!!!')

Results

With WORKAROUND=False I get:

got snapshot
WARNING:google.api_core.bidi:Background thread did not exit.
doc_watch is still alive (0s)
INFO:google.api_core.bidi:Thread-ConsumeBidirectionalStream exiting
doc_watch is still alive (5s)
doc_watch is still alive (10s)
doc_watch is still alive (15s)
doc_watch is still alive (20s)
doc_watch is still alive (25s)
<etc etc>

With WORKAROUND=True I get:

got snapshot
WARNING:google.api_core.bidi:Background thread did not exit.
doc_watch is still alive (0s)
INFO:google.api_core.bidi:Thread-ConsumeBidirectionalStream exiting
doc_watch IS DEAD!!!

Curious if others get the same results.
If so, would it make sense to add something similar to my workaround code as part of unsubscribe() so these objects get collected promptly?
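
To sketch what I mean, here's the workaround above packaged into a single helper (the function name is made up, and everything it touches below unsubscribe() is a private implementation detail of Watch and the underlying google.api_core.bidi objects, so this is illustrative rather than a proposed API):

from google.cloud.firestore_v1.watch import Watch


def unsubscribe_and_release(watch: Watch) -> None:
    """Unsubscribe a Watch and clear the internal references that
    otherwise keep it (and its snapshot callback) alive.

    Everything touched here is private and may change between releases;
    this just repackages the WORKAROUND path from the repro above.
    """
    # Grab these before unsubscribe() clears them.
    rpc = watch._rpc
    consumer = watch._consumer

    watch.unsubscribe()

    # Drop the references that were observed keeping the Watch alive.
    if rpc is not None:
        rpc._initial_request = None
        rpc._callbacks = []
    if consumer is not None:
        consumer._on_response = None
    watch._snapshot_callback = None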
