
dom-crawler + browser-kit empty file upload error "Path cannot be empty" #49014

Open
rudiedirkx opened this issue Jan 17, 2023 · 5 comments

@rudiedirkx

rudiedirkx commented Jan 17, 2023

Symfony version(s) affected

6.0.11

Description

Using Symfony's dom-crawler and browser-kit (from Goutte) I load a simple page with a simple form: https://webblocks.nl/_form1.html and submit it without giving a file. I would expect an empty file to be uploaded (like a real browser does), or no file to be uploaded at all, but instead browser-kit crashes because it tries to read the no-file like it's a real file.

I thought it had something to do with the special/nested file input name (files[b] instead of file), but both break! The form with the special/nested file input name is https://webblocks.nl/_form2.html, and it breaks in exactly the same way.

How to reproduce

composer require fabpot/goutte

<?php
require 'vendor/autoload.php';
$goutte = new \Goutte\Client();
$crawler = $goutte->request('GET', 'https://webblocks.nl/_form1.html'); // OR:
// $crawler = $goutte->request('GET', 'https://webblocks.nl/_form2.html');
$form = $crawler->selectButton('Submit it')->form();
$crawler = $goutte->submit($form, [
	'values' => [
		'text' => 'Oele',
	],
]);

Error:

ValueError: Path cannot be empty

From vendor/symfony/mime/Part/DataPart.php:68:

if (false === $handle = @fopen($path, 'r', false)) {
    throw new InvalidArgumentException(sprintf('Unable to open path "%s".', $path));

It doesn't crash on the InvalidArgumentException but, I assume, on the ValueError thrown by fopen itself. But that's not the point.
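
Editor's sketch of why the `@` in front of fopen does not help here, assuming PHP 8.0 or later: `@` only silences warnings, while an empty path makes fopen throw a ValueError before it can return false. `tryOpen()` and the non-existent file name are hypothetical and exist only for this illustration.

```php
<?php
// Sketch, assuming PHP 8.0+: @ suppresses warnings, not thrown errors.
// tryOpen() is a hypothetical helper for this demo only.

function tryOpen(string $path): string
{
    try {
        $handle = @fopen($path, 'r', false);
        if (false === $handle) {
            // A missing or unreadable file lands here: fopen() returns false.
            return 'returned false';
        }
        fclose($handle);

        return 'opened';
    } catch (\ValueError $e) {
        // An empty path never reaches the false check above.
        return get_class($e);
    }
}

echo tryOpen(''), PHP_EOL;
echo tryOpen(__DIR__ . '/no-such-file-for-this-demo'), PHP_EOL;
```

Under that assumption the empty path takes the catch branch and the missing file takes the `false` branch, which is consistent with the ValueError reported above instead of the InvalidArgumentException.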

dom-crawler gives browser-kit an empty file because it sees a file input, and the file is never filled because I only submit values[text]=Oele.


I'm using the Goutte client because it's an easy package, but I assume the same error occurs with just dom-crawler and browser-kit.

Possible Solution

The best solution would be to do exactly what the browser does: send a non-file. I have no idea how that works. Somewhere along the process browser-kit knows the file is empty:

^ array:1 [▼
  "file" => array:5 [▼
    "name" => ""
    "type" => ""
    "tmp_name" => ""
    "error" => 4
    "size" => 0
  ]
]

but it still tries to read it etc.

The second best solution would be to just completely ignore the file and not send it at all. dom-crawler's Form::getFiles() could filter on emptiness.
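
Editor's sketch of what "filter on emptiness" could look like, in plain PHP. `filterEmptyUploads()` is a hypothetical helper, not an existing dom-crawler method. It assumes the file entries have the `$_FILES`-like shape shown in the dump above, and it accepts both flat names and nested groups.

```php
<?php
// Hypothetical helper, not part of dom-crawler: drops upload entries
// that PHP-style metadata marks as "no file" (error 4).

function filterEmptyUploads(array $files): array
{
    $kept = [];
    foreach ($files as $name => $file) {
        if (!is_array($file)) {
            // Not an upload structure; keep it untouched.
            $kept[$name] = $file;
            continue;
        }
        // A leaf entry has the $_FILES shape (name, type, tmp_name, error, size).
        if (array_key_exists('error', $file) && array_key_exists('tmp_name', $file)) {
            if (UPLOAD_ERR_NO_FILE !== $file['error']) {
                $kept[$name] = $file;
            }
            continue;
        }
        // Otherwise treat it as a nested group such as files[b] and recurse.
        $nested = filterEmptyUploads($file);
        if ([] !== $nested) {
            $kept[$name] = $nested;
        }
    }

    return $kept;
}

$files = [
    'file' => ['name' => '', 'type' => '', 'tmp_name' => '', 'error' => 4, 'size' => 0],
];
var_dump(filterEmptyUploads($files));
```

With the entry from the dump as input, the result is an empty array, so nothing would be handed on to the code that opens the file.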

Additional Context

The full browser-kit Request just before it all breaks:

^ Symfony\Component\BrowserKit\Request {#686 ▼
  #uri: "https://webblocks.nl/_http_server.php"
  #method: "POST"
  #parameters: array:2 [▼
    "op" => "Submit it"
    "values" => array:1 [▼
      "text" => "Oele"
    ]
  ]
  #files: array:1 [▼
    "file" => array:5 [▼
      "name" => ""
      "type" => ""
      "tmp_name" => ""
      "error" => 4
      "size" => 0
    ]
  ]
  #cookies: []
  #server: array:4 [▼
    "HTTP_USER_AGENT" => "Symfony BrowserKit"
    "HTTP_REFERER" => "https://webblocks.nl/_form1.html"
    "HTTP_HOST" => "webblocks.nl"
    "HTTPS" => true
  ]
  #content: null
}

I feel like I must be doing something wrong, because this is such an obvious bug, but I'm not doing anything weird, just load a form and submit it without a file. Has nobody ever done that? How can this be a bug?

@xabbuh
Member

xabbuh commented Jan 18, 2023

Can you create a small example application that allows us to reproduce your issue?

@rudiedirkx
Author

The How to reproduce is all you need. That's it.

mkdir goutte
cd goutte
composer require fabpot/goutte
nano test.php # put that code in
php test.php

PHP Fatal error: Uncaught ValueError: Path cannot be empty in goutte/vendor/symfony/mime/Part/DataPart.php:68

@ging-dev
Contributor

The file field is not filled in. In principle it must either be filled in or be removed, like this:

<?php
use Goutte\Client;

require __DIR__.'/vendor/autoload.php';

$client = new Client();

$crawler = $client->request('GET', 'https://webblocks.nl/_form1.html');

$form = $crawler->filter('form')->first()->form();

// need to remove file field from form
$form->remove('file');

$crawler = $client->submit($form, [
    'values' => [
        'text' => 'Hi bro'
    ]
]);

echo $crawler->html();

@rudiedirkx
Author

Why? A browser doesn't do that. A web crawler should take care of that automatically, not the dev manually. I don't always know if the form I'm submitting has file fields. Is it the developer's responsibility to crawl the form for file fields before submitting? That makes dom-crawler + browser-kit a whole lot less useful/easy to use.

IF removing it is the way to go, dom-crawler or browser-kit should do that automatically IMO, BUT I don't even think that it's the way to go. A real browser just sends an empty file. Symfony should too.
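
Editor's sketch of what "sends an empty file" usually means on the wire: for an unfilled file input, browsers typically include a multipart part with an empty filename, the type application/octet-stream, and a zero-length body. `buildEmptyFileBody()` and the boundary string are hypothetical, the field names are taken from the request dump above, and names and values are not escaped, so this is an illustration only.

```php
<?php
// Hypothetical helper: builds a multipart/form-data body containing plain
// fields plus one file field for which no file was chosen.
// Simplification: names and values are written as-is, without escaping.

function buildEmptyFileBody(string $boundary, array $fields, string $fileFieldName): string
{
    $lines = [];
    foreach ($fields as $name => $value) {
        $lines[] = '--' . $boundary;
        $lines[] = sprintf('Content-Disposition: form-data; name="%s"', $name);
        $lines[] = '';
        $lines[] = $value;
    }
    // The unfilled file input: empty filename, generic type, zero-length body.
    $lines[] = '--' . $boundary;
    $lines[] = sprintf('Content-Disposition: form-data; name="%s"; filename=""', $fileFieldName);
    $lines[] = 'Content-Type: application/octet-stream';
    $lines[] = '';
    $lines[] = '';
    // Closing delimiter.
    $lines[] = '--' . $boundary . '--';
    $lines[] = '';

    return implode("\r\n", $lines);
}

echo buildEmptyFileBody(
    '----demo-boundary',
    ['values[text]' => 'Oele', 'op' => 'Submit it'],
    'file'
);
```

A PHP server receiving such a part would normally report it in `$_FILES` with error 4 (UPLOAD_ERR_NO_FILE), which is the same shape as the dump in the issue description.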

@rudiedirkx
Author

rudiedirkx commented Jan 25, 2023

Whatever the solution is, this very simple code shouldn't break the way it does:

$form = $crawler->selectButton('Submit it')->form();
$crawler = $goutte->submit($form, [
	'values' => [
		'text' => 'Oele',
	],
]);

That's example code. That can't become an uncaught exception somewhere deep in Mime. I don't know where the bug is, but it has to be a bug.
