🐛 Don't prefill default values on form input #13464
base: master
Conversation
Hi @sneakers-the-rat, thanks for the PR first of all. About your proposal: while on one side this could offer the tools to handle what you're asking for in #13380, on the other side it will not address the more general issue #13399. As I said in the issue discussion, a good and more robust solution could be harder and require more work than you think.

Technical comments (see fastapi/dependencies/utils.py, lines 727-729 at 643d284):
I attach a small, simple test that should pass, but it will not:

```python
class SimpleForm(BaseModel):
    foo: Annotated[str, Field(default="bar")]
    alias_with: Annotated[str, Field(alias="with", default="nothing")]

@app.post("/simple-form")
def form_endpoint(model: Annotated[SimpleForm, Form()]) -> dict:
    return model.model_dump()

def test_casted_empty_defaults(fastapi_client: TestClient):
    form_content = {
        "foo": "",
        "alias_with": ""
    }
    response = fastapi_client.post("/simple-form", data=form_content)
    response_content = response.json()
    assert response_content["foo"] == "bar"  # Expected: 'bar' -> Actual: ''
    assert response_content["alias_with"] == "nothing"  # ok
```

Thanks for your contribution.
Great, yeah, I was relying on the tests to be protective of any behavior that needs to be preserved, so if there are things that would break that aren't tested, then of course list them and we can add tests.

I am not sure how what you described there is more general than what I initially proposed or have PR'd here, but of course please let me know what needs to be fixed. This PR passes all the tests in #13399 except for […]. The two points raised in that discussion were:

This is what the PR fixes - it is now possible to detect which fields have been explicitly set vs. which are defaults.

The current code forces validation. The code in this PR fixes this as well, since the warning is a serialization warning, not a validation warning - again, because the model is misspecified. The current code before this PR also fails this test. So if we want to make fixing that an additional bug we handle in this PR, I can do that, but in my opinion this is actually correct behavior for a misspecified model.
The change I have made only affects […]. The old code was:

```python
if field.required:
    return
else:
    return deepcopy(field.default)
```

The new code is:

```python
if return_default and not field.required:
    return deepcopy(field.default)
else:
    return None
```

Notice how when the default value of […]

This is not a breaking change, as this also fails in the current code, but again, I can fix the handling of aliases as an additional bug if we want to attach it to this PR as well.
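For illustration, the downstream effect of not prefilling can be reproduced with plain pydantic (a minimal sketch; `M` is a hypothetical model, not from the PR):

```python
from pydantic import BaseModel

class M(BaseModel):
    flag: bool = True

# With defaults no longer prefilled by fastapi, a key the client never
# sent simply doesn't reach pydantic, so the model's own default applies
# and model_fields_set stays truthful:
m = M.model_validate({})
assert m.flag is True
assert m.model_fields_set == set()  # default applied, field not "set"

m2 = M.model_validate({"flag": True})
assert m2.model_fields_set == {"flag"}  # explicitly provided
```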
Hello @sneakers-the-rat, I'm sorry, I mixed up the tests - the following one is the right one for your case:

```python
app = FastAPI()

@pytest.fixture(scope="module")
def fastapi_client():
    with TestClient(app) as test_client:
        yield test_client

class SimpleForm(BaseModel):
    foo: Annotated[str, Form(default="bar")]
    alias_with: Annotated[str, Form(alias="with", default="nothing")]

@app.post("/simple-form")
def form_endpoint(model: Annotated[SimpleForm, Form()]) -> dict:
    return model.model_dump()

def test_casted_empty_defaults(fastapi_client: TestClient):
    form_content = {
        "foo": "",
        "with": ""
    }
    response = fastapi_client.post("/simple-form", data=form_content)
    response_content = response.json()
    assert response_content["foo"] == "bar"  # Expected: 'bar' -> Actual: ''
    assert response_content["alias_with"] == "nothing"  # ok
```

Right now it is possible to use the given format and, in a way, this is the one used for File and UploadFile.
With your implementation you are disabling any possible handling of […] (see fastapi/dependencies/utils.py, lines 724-732 at 643d284).

And this will allow original values to come back in […] (see fastapi/dependencies/utils.py, lines 870-873 at 643d284).

This is exactly what I reported in the last test.
…ests to demonstrate the problem, fixing in next commit)
…astapi into form-defaults
Ah, I see. These lines (fastapi/dependencies/utils.py, lines 870-873 at 643d284) seem incorrect: the purpose they seem to serve is to get any additional fields that are passed in the form body but are not included in the form model. Currently they are too general - the test is "[…]". It should actually be […]. I added your test and now it passes.

The core of the problem appears to be empty string input, so I added an additional test that checks empty string input against all the possible ways of specifying a form. The docs show the ability to specify form data in two ways:

```python
@app.post("/login/")
async def login(username: Annotated[str, Form()], password: Annotated[str, Form()]):
    return {"username": username}
```

```python
class FormData(BaseModel):
    username: str
    password: str

@app.post("/login/")
async def login(data: Annotated[FormData, Form()]):
    return data
```

(I actually can't find the example case you're using of specifying […].)

So when extracting the data, the current check of whether the field is a […] (fastapi/dependencies/utils.py, line 727 at 643d284) is not enough on its own. We also care about whether the containing field is a […]. So see the test run for […]. So I revised the prior impl semantics of […].

To be clear, again - this is a bug in the current implementation, not introduced by this PR. The tests in […]. The correct solution would probably be to remove all handling of special form cases from […].

I would also like to add this to the form data docs, since it is a basic part of handling form data with fastapi - how to handle missing values, checkboxes, etc. - but I wanted to wait until we had general approval on this approach. Let me know when we get there and I'll add docs.
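As a standalone illustration of the empty-string artifact discussed above (not fastapi code), note how url-encoding distinguishes a blank field from an omitted one:

```python
from urllib.parse import urlencode

# A blank <input type="text" name="foo"> still submits its key with an
# empty value, which is different from omitting the field entirely:
assert urlencode({"foo": ""}) == "foo="  # blank field: key present, empty value
assert urlencode({}) == ""               # omitted field: key absent
```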
Overall I think this will solve a lot of issues related to form fields with defaults: it is now possible to keep track of whether a field has been set or not. It is even possible to define a default placeholder that is not compatible with the defined type. This option needs to be well documented, because default values are sent to the docs.
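To make the "default placeholder that is not compatible with the defined type" idea concrete, a minimal sketch (my own example, relying on the fact that pydantic v2 does not validate defaults unless `validate_default=True`):

```python
from pydantic import BaseModel, Field

class Search(BaseModel):
    # Sentinel default of a different type: None means "never submitted".
    # Runtime-valid in pydantic v2 because defaults are not validated by
    # default (static type checkers may still flag it).
    limit: int = Field(default=None)

s = Search()
assert s.limit is None                # sentinel passed through unvalidated
assert Search(limit="5").limit == 5   # submitted values are validated/coerced
```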
```python
# preserve extra keys not in model body fields for validation
for key, value in received_body.items():
    if key not in values:
        if key not in field_aliases:
```
This is the right thing to do here =) and will even solve an already existing bug, where None is returned =)
```python
async with anyio.create_task_group() as tg:
    for sub_value in value:
        tg.start_soon(process_fn, sub_value.read)
value = serialize_sequence_value(field=field, value=results)
```
Probably we need more tests. Here things changed because we are not returning defaults anymore.
Sure - name whatever tests need to be written. If the tests aren't already there to protect existing functionality, I'm not sure how I would know what should break. Happy to write more tests, but at some point we do need to limit the scope of this PR so it doesn't turn into "fix every problem with form handling".

Will add docs tomorrow; I had already written a short example in the discussion, so I can improve that and include it in the docs.
The proposed solution resolves the mentioned bug and includes additional tests covering the expected behavior. I'm still somewhat unsure about the chosen names for certain parameters, as they might benefit from being more general, but I'll defer that judgment to the final review before merging. In particular, I'd like to highlight the behavior change regarding the exclusion of default values from validation error messages, which I believe is the correct choice. Overall, I think this is excellent work. Thanks for your contribution!
Happy to change any names, and also I will spend a bit of time this afternoon describing this behavior in the docs :)
📝 Docs preview for commit cad08bb at: https://64e076c0.fastapitiangolo.pages.dev
…astapi into form-defaults
📝 Docs preview for commit fec0a06 at: https://5ac52580.fastapitiangolo.pages.dev
OK, I added docs for this here: f63e983. While I was doing that, I realized I couldn't really write satisfying docs because the feature is arguably incomplete - we still needed a way to detect if we're being validated as a form (or being used elsewhere in the program), which was the whole root of the problem. So I added this into the field validation wrapper:

```python
self._type_adapter.validate_python(
    value, from_attributes=True, context={"fastapi_field": self}
),
```

where we just pass the fastapi […]. The reason that was necessary is hopefully obvious from the added docs: otherwise it would be impossible to add special validation behavior only when handling form data without using […]. Previously I had proposed a special […].

I recognize that this is a late addition to the PR, but it is still directly related to the purpose of the PR (being able to handle form data when it differs from, e.g., JSON or model use within python). I can remove it if we don't like it.

edit: by the way! very cool docs setup. I tried to add pydantic v1 examples, but it doesn't look like they're showing up in the docs preview. Not sure what I am doing wrong there, but they are in the docs src.
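For a concrete picture of how a validator could branch on that context key, a minimal sketch against the line above (hypothetical usage; `Prefs` and the checkbox semantics are illustrative only):

```python
from typing import Any

from pydantic import BaseModel, ValidationInfo, model_validator

class Prefs(BaseModel):
    dark_mode: bool = True

    @model_validator(mode="before")
    @classmethod
    def form_checkbox_semantics(cls, data: Any, info: ValidationInfo) -> Any:
        # Only active when fastapi passes its field object through the
        # validation context (the line added above); JSON input and
        # in-program use keep the model's normal semantics.
        if info.context and "fastapi_field" in info.context:
            if isinstance(data, dict) and "dark_mode" not in data:
                data["dark_mode"] = False  # absent checkbox means unchecked
        return data
```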
@sneakers-the-rat, @luzzodev, thanks for working on this!

I think to get this merged we need to split these changes into several PRs.

First, we need to fix the issue with the default values being passed as input values to the Form model (which causes the inability to check which fields were set). I would focus on this main issue in this PR, and when it's merged we can open new PRs to add docs about quirks with form-encoded data.

I haven't properly reviewed the tests yet. As I see it now, we basically need to make the following pass:
```python
from typing import Annotated

from fastapi import FastAPI, Form
from fastapi.testclient import TestClient
from pydantic import BaseModel

class ExampleModel(BaseModel):
    field_1: bool = True

app = FastAPI()

@app.post("/form")
async def form_endpoint(model: Annotated[ExampleModel, Form()]):
    return {
        "fields_set": list(model.model_fields_set),
        "field_1": model.field_1,
    }

client = TestClient(app)

def test_form_model_fields_set_empty():
    resp = client.post("/form", data={})
    assert resp.status_code == 200, resp.text
    resp_json = resp.json()
    assert resp_json["fields_set"] == []
    assert resp_json["field_1"] is True
```
fastapi/_compat.py (outdated):
```python
self._type_adapter.validate_python(
    value, from_attributes=True, context={"fastapi_field": self}
),
```
This is not related to the main purpose of this PR; I suggest removing it from this PR.

Also, I think it is not necessary to add the context here at all. To understand whether the model is used for form data or JSON data, you can create a new model that inherits from the existing model and add the validator to this new model only. Then use this new model for form data. See the code example below:
```python
from typing import Annotated

from fastapi import FastAPI, Form
from fastapi.testclient import TestClient
from pydantic import BaseModel, model_validator

class ExampleModel(BaseModel):
    field_1: bool = True

class ExampleFormModel(ExampleModel):
    @model_validator(mode="before")
    def validate_inputs(cls, values: dict) -> dict:
        if values.get("field_1") is None:
            values["field_1"] = False
        return values

app = FastAPI()

@app.post("/form")
async def form_endpoint(model: Annotated[ExampleFormModel, Form()]):
    return {
        "fields_set": list(model.model_fields_set),
        "field_1": model.field_1,
    }

client = TestClient(app)

def test_form_model_fields_empty_is_false():
    resp = client.post("/form", data={})
    assert resp.status_code == 200, resp.text
    resp_json = resp.json()
    assert resp_json["field_1"] is False

def test_form_model_fields_explicit_true():
    resp = client.post("/form", data={"field_1": True})
    assert resp.status_code == 200, resp.text
    resp_json = resp.json()
    assert resp_json["field_1"] is True

def test_form_model_fields_explicit_false():
    resp = client.post("/form", data={"field_1": False})
    assert resp.status_code == 200, resp.text
    resp_json = resp.json()
    assert resp_json["field_1"] is False
```
Sure, but the whole underlying purpose is that we want to be able to use models in a way that is agnostic to the input type (form, JSON, etc.). We could just make different models for every endpoint, and that would avoid the entire problem, but what we want is to be able to reuse models within the program as well as for form input, so that we can avoid making a million separate models. The change here is extraordinarily minimal, is general-purpose (there are probably other circumstances where we would want to access the fastapi field within a validator), and fully backwards compatible.

I can remove this from this PR, but it means that we can't really document the changes here in a way that makes sense - "you want to be able to handle quirks of form data... there's actually no satisfying way to do that, but here is an awkward hack that involves duplicating your model classes..."
> the change here is extraordinarily minimal, is general-purpose (there are probably other circumstances when we would want to access the fastapi field within a validator), and fully backwards compatible.

This is not necessary for the purpose of this PR, right? You can open a new PR for this.

My idea is: the fewer unnecessary changes this PR contains, the more chances it will be approved by Sebastián and merged.
I understand the amount of labor it takes to maintain a widely used package, so my full sympathies there, and I don't mean to be harsh, but I would be much more willing to split things off into a separate PR if PRs moved with any velocity in this package.

If I copy just the text parts of all my comments from the original discussion that were required to justify raising an issue, then the comments in the issue justifying the change, and then the comments here, I'm at >3,600 words over five months. That's all for a PR that changes seven (7!!!!!) lines of code in the package to fix something that is inarguably a bug in the mismatch between the handling of form-encoded data and model annotations. If I have to go through all that again to change literally a single line, the odds I actually go through with that are almost 0 - I've already adapted my code to just never use form-encoded data because this took so long, so it no longer affects me; I'm just trying to improve the state of the package for everyone since I had already started.

Again, this change is necessary for the purpose of this PR: ideally fastAPI would just handle form data reasonably and use the information it already has from the type annotations to handle the inconsistency between form-encoded data and python types (where, e.g., booleans are encoded as false -> not present, true -> "on"), but since I was waved down from making it actually work well, the fallback solution that touched the least amount of code was to be able to detect when a model is being validated as a form vs. when it's being validated from JSON or used in the python code. Without this change, it would still be impossible to handle the inconsistencies between form encoding and JSON (the root of the problem!), because we would be unable to tell when a model was being used with form input vs. other contexts. There would be effectively nothing to document, and no bug would be fixed, because the docs would just go "handling defaults in forms is weird, but there is no solution, sorry! just make more models, despite using your models everywhere from the API through the database being one of the main features of fastAPI!" I could have made this more relevant to the PR by just passing `{"fastapi_form": True}` in the context, but that would be far, far less broadly useful, so my approach here was trying to both address the immediate needs of the PR and provide additional useful functionality, which is the source of it appearing unrelated to the PR.

Along the way I proposed several other, imo better, resolutions to that problem (e.g. a specific `@form_validator` decorator, "just handling it correctly"), but since the whole process is designed to whittle down contribs to as few lines as possible, those better options got pruned and this is what we have left. I've received extensive input from @luzzodev (thanks again!) and now from you, but it seems like in both cases none of us are really sure what would be permitted, and y'all don't seem to have discretionary power to make calls about what should go in and what shouldn't. So the result of a process that is intended to spare Sebastián from having to waste time triaging a bunch of junk issues and PRs (again! all sympathy there, totally understood how much work that must be) seems to actually be a very low volume of half-measure PRs, because decisionmaking power isn't actually being delegated to y'all who are longterm maintainers, and people who want to contribute fall out of the process or accept half-measures at each step of the funnel from the forced discussion through the PR process. The velocity of non-docs, non-bot PRs in a project this widely used speaks, I think, to that bottleneck.

So anyway, this has already been a very long journey for such a tiny, tiny bugfix. I could move this to another PR if it wouldn't take another 5 months to go through the whole process, but pre-emptively removing changes that appear to be off-topic because the single person with decisionmaking power will only have 5 milliseconds to review the PR seems like a pretty maladaptive dev process to me, and I would probably just not bother and abandon the PR.
I totally understand your frustration about the long wait for this to be reviewed. On the other side, I know that sometimes even tiny changes may introduce more bugs than they fix.

Also, it took me more than 1h to go through all the comments (issue, discussion and this PR) to make sure I understood the problem and didn't miss anything. Of course it will take much less time for Sebastián, but it's still not "5 milliseconds", and it's something that can be improved from our side.

Let's stop creating even more comments. :)

I'll find time to open an alternative version of this PR (anybody else who is willing to contribute, feel free to take this over) and let Sebastián decide which one to review and merge.

Thanks again for your contribution!
Yeah, right - the process needing to loop through the long route it did makes it hard to review in addition to hard to contribute, and needing to have a bunch of people retrace that context, instead of the maintainer (luzzo) who followed it being able to make a call, multiplies that. Splitting off into competing approaches is demotivating in a similar way: if I am trying to contribute and push something through a long process, and then at the end my contribs just get dropped in favor of an internal contributor duplicating my work, I wonder why bother. In both cases it's the indeterminacy of "here is a diagnosed problem, but we don't really know what kind of fix could get merged" that is causing lots of extra work.

Like I said, if what you're saying is "in order for this PR to be merged, split it into two, and then we could review them independently and merge them together as different parts of addressing the same problem", then I would happily do it right now. But if the request is "split this PR into two to double the process, that might make this incomplete PR mergeable, and then 5 months later maybe the thing that finally resolves the issue can be merged", then that's when I write comments like the above rather than posting 🫡 and opening the PR.
> in favor of an internal contributor duplicating my work

I think this part is not true. I'm not going to duplicate your work. I'm going to take a part of your work (I will of course point to your PR as the source of these changes and explain why I created a new PR), review it carefully, revise the tests and probably add others, and add a concise description. I think it's normal for open source. This is how great things get created. If somebody else is willing to do this, it will be even better for me!

Yes, unfortunately I can't predict the decisions of Sebastián yet. But to get this PR reviewed by Sebastián, someone who is responsible for sorting PRs should take the responsibility to approve it and send it to Sebastián for review. In its current state I'm not ready to take this responsibility, sorry..
Alright, in that case: all yours, I'm just going to fork.
If we want to get this reviewed by Sebastián and merged, our task is to prepare this PR so that the reviewing process is as simple as possible for him. I think for this we need to: […]
@sneakers-the-rat & @luzzodev: thanks so much for your time and effort put into this. I also understand your frustration with the PR reviewing process. As you mentioned, it really is a huge task going through all the PRs in detail. Considering the number of users of the library, we are very wary of introducing new bugs with new features, or even with bug fixes, and so we really do need to be very careful whenever we merge any PR. That said, I agree that the process should be more fluent, especially when it comes to bug fixes. I also want to say I really appreciate how much work you've both put into creating solid tests etc.; this is really crucial for the code quality.

With respect to Sebastián reviewing the "final" PR (whenever e.g. Yurii or I sign off on it), I can assure you he would take a lot more than 5 milliseconds to go over it. In fact, he is very meticulous in his reviewing process: he goes through every detail, thinks of every edge case, and often finds issues even when I, for instance, thought it was 100% good to merge. He knows this library inside-out like no one else, and that's why his final review is so important. But at the same time we try to make his job as efficient as possible by having each PR address only one atomic fix/feature. Hence Yurii's requests.

Anyway, let's try and keep the momentum for this PR, as it would be unfortunate to lose this work. Ideally, we should try and work off the original PR to ensure the original contributor gets recognition as well. @YuriiMotov: do you have time to work on this?
I appreciate that everyone has limited time, and I am not trying to be a burden or ornery about having things my way, but I still maintain that the change being objected to is directly related to, and necessary to resolve, the underlying issue for this PR. The change is this single line that gives the outer type to the validation context as […]. The reason this is core to the purpose of this PR is pretty simple: if we don't have (c2), then we don't solve the problem - we might be able to tell when a value with a default is missing, but that doesn't really matter, because there's no way to distinguish form data from other contexts like in-program validation or validation from JSON! Again, I had proposed several other means of accomplishing this, like a […].

I wrote tests that covered exactly the changes in the PR, and through review was asked to add several more. The docstrings in the tests describe what they are testing, and hopefully the parameterization is obvious enough. Let me know which tests are unnecessary or how you would simplify them, but I am not seeing anything unnecessary there reading again.

I did update the description as requested. Also let me know what needs to be updated there, because to me it just describes the changes and provides links to context.

Sure, if closing and reopening the same PR to clear its history is a thing that's done around here, I can do that, no problem.
I already suggested a way to do this without adding context: #13464 (comment). I know you didn't like it, but "there's no way to distinguish ..." is not true.
One could also directly parse the binary from the […].

If this is actually the way all the maintainers want things to work... I'm sort of worn down after 6 months, so I removed the one controversial line.
📝 Docs preview for commit 1b8d0d7 at: https://c3a274c0.fastapitiangolo.pages.dev
This pull request has a merge conflict that needs to be resolved.
Fix: #13399
Discussed in: #13380
summary
Context in above issue and discussion.
This PR makes JSON and form data behave consistently: it does not prefill default values, but instead lets the model do that, so that model validators are able to differentiate between unset and default values.
motivation
This is particularly important for form-encoded data, as some fields like checkboxes do not have the behavior that would be expected given the model type, and so it becomes impossible to decouple the semantics of the model from form parsing. (It is entirely reasonable to have defaults for boolean fields: e.g., if one were programmatically generating a form from a model, one would use the default to set the initial state of a checkbox.)
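For concreteness, the checkbox mismatch looks like this (an illustrative sketch; `Settings` is not from the PR):

```python
from pydantic import BaseModel

class Settings(BaseModel):
    # A sensible model-level default: a form generated from this model
    # would render the checkbox initially checked.
    subscribe: bool = True

# What a browser submits for <input type="checkbox" name="subscribe">:
#   checked   -> "subscribe=on"  (key present)
#   unchecked -> ""              (key absent entirely)
# If the form parser prefills subscribe=True whenever the key is absent,
# an unchecked box becomes indistinguishable from "left at the default".
```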
impl notes
I did not include all the test cases from #13399, since many of them were just testing pydantic model behavior, which seems out of scope for fastapi to test, but I did make use of some of the ideas from there (thanks for getting this started @luzzodev!!!), and it should be easy to add them in if we want to have them.
edit: re: this comment requesting clarification on the changes to `_extract_form_body`, see this comment explaining the problem and my comment explaining the resolution. `values` is populated in two stages: first, any values in `received_body` that have defined annotations from a model/field (`body_fields`) are extracted using the annotation; second, any values that do not have a corresponding annotation are added as-is for downstream validation. Previously, the first stage would populate `values` with defaults, and so in the second stage we could check for "extra" values with `if key not in values`. However, since we are now not prepopulating the defaults (the point of the PR), in the case of empty strings (a form-encoding artifact) `if key not in values` would just add them back in (defeating the whole point of the PR), and so we needed a different method of preserving the `received_body` fields without accompanying `body_fields` annotations. So the change is incredibly minor - we just get a set of field aliases and check against that, rather than checking against what is already in `values`. This is a narrowly focused change that should only make it so we don't include empty strings in `values` when processing form input (see the sketch below).
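Paraphrasing that second stage as a self-contained helper (a sketch using the names from the PR; `merge_extra_keys` is hypothetical, not the actual fastapi source):

```python
def merge_extra_keys(received_body: dict, values: dict, body_fields: list) -> dict:
    # Keys the model declares: by alias when present, else by name.
    field_aliases = {field.alias or field.name for field in body_fields}
    # Keep only keys that were neither extracted in stage one nor declared
    # on the model, so empty-string artifacts for declared fields are not
    # re-added to `values`.
    for key, value in received_body.items():
        if key not in values and key not in field_aliases:
            values[key] = value
    return values
```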
future directions

This is the most minimal form of the proposed changes, which allows someone to use a `model_validator` in `before` mode to differentiate unset and default values. A more complete fix would be to a) make a `@form_validator` that receives the raw form input before normal model validation, and b) if no `@form_validator` is declared for a model, use the field annotation information to infer how to process form-encoded data (e.g., for a `bool` field, if the key is absent in the input, interpret it as `False`; for a `list[str]` field, treat it as an empty list, etc.). I would be happy to continue on and implement these improvements, which would make form handling a lot more explicit, but I'll hold here until I get approval, because this is also sufficient for my purposes for now.
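To make point (b) concrete, a rough sketch of annotation-driven inference (hypothetical helper, not part of this PR):

```python
from typing import get_origin

from pydantic import BaseModel

def infer_form_absences(data: dict, model: type[BaseModel]) -> dict:
    """Fill in form-encoding absences based on the declared annotations."""
    for name, field in model.model_fields.items():
        key = field.alias or name
        if key in data:
            continue  # field was submitted; leave it alone
        if field.annotation is bool:
            data[key] = False  # absent checkbox -> unchecked
        elif get_origin(field.annotation) is list:
            data[key] = []  # absent multi-select -> no selections
    return data
```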