install keras-nlp and keras-cv pre-release versions #1340

Merged
Philmod merged 6 commits into Kaggle/docker-python:main from Kaggle/docker-python:keras-nlp-cv-dev
Dec 12, 2023

Philmod commented Dec 11, 2023

Dockerfile.tmpl Outdated
@@ -549,7 +549,8 @@ RUN pip install flashtext \
keras-core \
# b/312946339 latest version not compatible with our version of keras
keras-cv==0.6.4 \

The new dev version for keras-cv has also been released. It is 0.8.0.dev0.

rosbo commented Dec 11, 2023

The tests are failing because the package tries to download the weights from kaggle.com, but internet access is disabled in our unit tests for reproducibility, speed, and flakiness reasons.

The test sets the load_weights parameter to False, which initializes the model with random weights. This was done to ensure it doesn't try to download the model weights from GCS.

Looks like the new package ignores that parameter... I will ask the keras_nlp folks about it.

load_weights=False, # load randomly initialized model from preset architecture with weights

rosbo commented Dec 11, 2023

Before, the "configuration" was checked in. Now, it needs to load the config.json file over the internet before it can load the model with random weights. I guess that is expected now that the config doesn't live in the source code itself. I will think about how to fix this for our test.

rosbo commented Dec 12, 2023

@Philmod I pushed to your branch to add a fake API server for kagglehub given internet access is disabled in our smoke tests. I also updated the Dockerfile to install the dev version of keras-cv.
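
A fake API server of the kind described above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the code pushed to the branch: `CANNED_FILES`, the example request path, `FakeApiHandler`, `start_fake_server`, and `fetch` are all hypothetical names chosen for illustration, and how the client library is pointed at the local server is left out.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical canned responses keyed by request path (not the PR's fixtures).
CANNED_FILES = {
    "/api/v1/models/example/model/download/config.json":
        json.dumps({"hidden_dim": 8}).encode("utf-8"),
}


class FakeApiHandler(BaseHTTPRequestHandler):
    """Serves fixed bytes for known paths and 404 for everything else."""

    def do_GET(self):
        body = CANNED_FILES.get(self.path)
        if body is None:
            self.send_error(404, "unknown path")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_fake_server():
    """Start the server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), FakeApiHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server, path):
    """Fetch a path from the fake server, bypassing any configured proxy."""
    host, port = server.server_address
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}{path}") as response:
        return response.read()
```

Because the server binds to port 0, the operating system picks a free port, so parallel test runs do not collide. Call `server.shutdown()` and `server.server_close()` when the test finishes.
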

self.send_response(200)

def do_GET(self):
    m = re.match("^/api/v1/models/(.+)/download/(.+)$", self.path)

Check failure

Code scanning / CodeQL

Polynomial regular expression used on uncontrolled data

This regular expression, which depends on a user-provided value, may run slow on strings starting with '/api/v1/models/a/download/' and with many repetitions of 'a/download/a'.
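
One way to avoid this class of alert is to parse the path with plain string operations, which run in linear time. The sketch below is an assumption about how that could be done, not what the PR adopted: `parse_download_path`, `PREFIX`, and `SEPARATOR` are hypothetical names, and splitting on the last "/download/" is chosen to mirror how the greedy first group in the original pattern behaves. Unlike `.+`, it does not reject newlines in the path.

```python
PREFIX = "/api/v1/models/"
SEPARATOR = "/download/"


def parse_download_path(path):
    """Return (model_handle, file_path) or None if the path does not match.

    Uses startswith and rpartition instead of a regex, so there is no
    backtracking regardless of how many times "/download/" repeats.
    """
    if not path.startswith(PREFIX):
        return None
    rest = path[len(PREFIX):]
    # Split on the last separator, like the greedy (.+) in the regex.
    handle, sep, filename = rest.rpartition(SEPARATOR)
    if not sep or not handle or not filename:
        return None
    return handle, filename
```

For example, `parse_download_path("/api/v1/models/owner/model/download/config.json")` yields the handle `owner/model` and the file `config.json`.
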
Philmod changed the title from "install keras-nlp pre-release" to "install keras-nlp and keras-cv pre-release" on Dec 12, 2023
Philmod changed the title from "install keras-nlp and keras-cv pre-release" to "install keras-nlp and keras-cv pre-release versions" on Dec 12, 2023
Philmod commented Dec 12, 2023

Test failure:

======================================================================
ERROR: test_fit (test_keras_nlp.TestKerasNLP)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/input/tests/test_keras_nlp.py", line 12, in test_fit
    classifier = keras_nlp.models.BertClassifier.from_preset(
  File "/opt/conda/lib/python3.10/site-packages/keras_nlp/src/models/task.py", line 213, in from_preset
    return super(cls, calling_cls).from_preset(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/keras_nlp/src/models/task.py", line 190, in from_preset
    tokenizer = load_from_preset(
  File "/opt/conda/lib/python3.10/site-packages/keras_nlp/src/utils/preset_utils.py", line 199, in load_from_preset
    tokenizer.load_assets(asset_dir)
  File "/opt/conda/lib/python3.10/site-packages/keras_nlp/src/tokenizers/word_piece_tokenizer.py", line 338, in load_assets
    self.set_vocabulary(path)
  File "/opt/conda/lib/python3.10/site-packages/keras_nlp/src/models/bert/bert_tokenizer.py", line 92, in set_vocabulary
    super().set_vocabulary(vocabulary)
  File "/opt/conda/lib/python3.10/site-packages/keras_nlp/src/tokenizers/word_piece_tokenizer.py", line 349, in set_vocabulary
    self.vocabulary = [line.rstrip() for line in file]
  File "/opt/conda/lib/python3.10/site-packages/keras_nlp/src/tokenizers/word_piece_tokenizer.py", line 349, in <listcomp>
    self.vocabulary = [line.rstrip() for line in file]
  File "/opt/conda/lib/python3.10/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 3793: ordinal not in range(128)

----------------------------------------------------------------------
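
The traceback suggests the vocabulary file was opened in text mode without an explicit encoding, so the container's default resolved to ASCII and the first non-ASCII byte (0xc2 here) failed to decode. The sketch below reproduces that failure mode and shows that passing `encoding="utf-8"` reads the same file. It is an illustration under that assumption, not the fix applied upstream: `read_vocabulary`, `demo`, and the two-line vocabulary are hypothetical.

```python
import os
import tempfile


def read_vocabulary(path, encoding="utf-8"):
    """Read one token per line, decoding with an explicit encoding."""
    with open(path, "r", encoding=encoding) as file:
        return [line.rstrip() for line in file]


def demo():
    """Return (ascii_failed, tokens_read_as_utf8) for a tiny vocabulary."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "vocab.txt")
        # "£" encodes to bytes 0xc2 0xa3 in UTF-8, the same lead byte
        # reported in the traceback.
        with open(path, "wb") as file:
            file.write("[PAD]\n£\n".encode("utf-8"))
        try:
            read_vocabulary(path, encoding="ascii")
            ascii_failed = False
        except UnicodeDecodeError:
            ascii_failed = True
        return ascii_failed, read_vocabulary(path)
```

Relying on the process default makes the result depend on the image's locale settings, whereas an explicit encoding behaves the same everywhere.
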

Philmod merged commit ff731ce into main on Dec 12, 2023
Philmod deleted the keras-nlp-cv-dev branch on December 12, 2023 at 11:34