
aws-s3-multipart + golden retriever #5180

Open · 2 tasks done
theolundqvist opened this issue May 17, 2024 · 3 comments · May be fixed by #5192

@theolundqvist

Initial checklist

  • I understand this is a bug report and questions should be posted in the Community Forum
  • I searched issues and couldn’t find anything (or linked relevant results below)

Link to runnable example

No response

Steps to reproduce

This is similar to #4985, where the problem was with "listParts". However, my "listParts" function is never run, not even when I declare it in JS in "use(AwsS3", so I guess it crashes before it gets that far.

Use multipart + golden retriever
[Screenshot: CleanShot 2024-05-17 at 13 50 16]
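Since the screenshot is not available here, the following is a minimal sketch (not the exact code from the screenshot) of what such a setup typically looks like: @uppy/aws-s3 in multipart mode with custom endpoint functions, plus Golden Retriever. The /s3/multipart/... paths, the api helper, and the request bodies are placeholders for whatever the backend exposes.

import Uppy from '@uppy/core'
import AwsS3 from '@uppy/aws-s3'
import GoldenRetriever from '@uppy/golden-retriever'

// Hypothetical helper for the backend routes; adjust to your own API.
const api = (path, options) =>
  fetch(`/s3/multipart${path}`, options).then((response) => response.json())

const uppy = new Uppy()
  .use(AwsS3, {
    shouldUseMultipart: true,
    createMultipartUpload: (file) =>
      api('', {
        method: 'POST',
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify({ filename: file.name, type: file.type }),
      }),
    // The complaint in this issue: this override is reportedly never called
    // when Golden Retriever restores the files after a refresh.
    listParts: (file, { uploadId, key }) =>
      api(`/${uploadId}?key=${encodeURIComponent(key)}`),
    signPart: (file, { uploadId, key, partNumber }) =>
      api(`/${uploadId}/${partNumber}?key=${encodeURIComponent(key)}`),
    completeMultipartUpload: (file, { uploadId, key, parts }) =>
      api(`/${uploadId}/complete?key=${encodeURIComponent(key)}`, {
        method: 'POST',
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify({ parts }),
      }),
    abortMultipartUpload: (file, { uploadId, key }) =>
      api(`/${uploadId}?key=${encodeURIComponent(key)}`, { method: 'DELETE' }),
  })
  .use(GoldenRetriever, { serviceWorker: true })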
I have implemented the companion endpoints for multipart uploads in laravel as shown below:
[Screenshot: CleanShot 2024-05-17 at 13 52 58]

Everything works perfectly, except restoring the upload.

  1. Upload a 3 GB file until it reaches 10%.
  2. Refresh the page.
  3. Open Uppy (it shows "we have restored all files").
  4. Crash.

Expected behavior

Restore all chunks and resume the upload.

Actual behavior

[Error screenshots]

Crashes on reading size when restoring files.

Thanks for the help.

@Mitchell8210

Mitchell8210 commented May 17, 2024

I am experiencing the exact same problem implementing Uppy with the S3 multipart upload + Golden Retriever, trying to get it to resume the file upload where it left off after the connection is cut by a browser page refresh.
(I have been successful if I leave out {serviceWorker: true} from the config, but that is not suitable for my requirements, as the upload restarts from the beginning if I lose the connection.)
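For reference, the serviceWorker setup being referred to looks roughly like the sketch below, based on the Golden Retriever docs; the '/sw.js' path is a placeholder for wherever your bundler serves a copy of @uppy/golden-retriever/lib/ServiceWorker.

import Uppy from '@uppy/core'
import GoldenRetriever from '@uppy/golden-retriever'

const uppy = new Uppy().use(GoldenRetriever, { serviceWorker: true })

// The service worker lets Golden Retriever keep the file blobs around across
// page reloads, so a refresh does not force the upload back to zero.
if ('serviceWorker' in navigator) {
  navigator.serviceWorker
    .register('/sw.js') // placeholder path for the bundled Golden Retriever ServiceWorker
    .then((registration) => {
      console.log('ServiceWorker registered with scope:', registration.scope)
    })
    .catch((error) => {
      console.error('ServiceWorker registration failed:', error)
    })
}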

The main difference between your implementation and mine is that I did not use Companion or Tus, because I really do not want to set up a server just to transfer the file to S3 in the end. I think it is more efficient, and not overly complicated, to achieve resumable uploads without another server for the file to flow through before it reaches S3.

I am thinking that this may not be resolved and that I may have to build a solution myself that mimics this library.

@Murderlon
Member

> Thinking that this may not be resolved and may have to build a solution myself that mimics this library.

It would probably be better for the ecosystem, and save you some effort, if you are willing to contribute rather than building it again. Would you be interested in taking a look?

@Mitchell8210

Mitchell8210 commented May 22, 2024

I wanted to post an update about this issue. My co-workers took a crack at implementing Uppy with the Golden Retriever plugin without the serviceWorker flag and were successful.

The main issue they solved was an invalid-signature error (S3 signing) when attempting to resume the upload from where it left off. After this was resolved, the solution works the way we need it to: resumable uploads without another server, and without Companion or Tus.

My suspicion is that the invalid-signature error was caused by a malformed listParts command on the server, which is used when attempting to resume the upload.
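For context, the part-signing route in that same aws-nodejs example (the place where the presigned part URL, and therefore the signature, is produced) looks roughly like the sketch below. It is simplified: app and getS3Client are the same helpers used elsewhere in that example, and the exact response shape and expiry in the real file may differ.

const { UploadPartCommand } = require('@aws-sdk/client-s3')
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner')

app.get('/s3/multipart/:uploadId/:partNumber', (req, res, next) => {
  const client = getS3Client()
  const { uploadId, partNumber } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter.' })
    return
  }

  // Presign a URL the browser can PUT this part to; the signature covers the
  // bucket, key, uploadId and part number.
  getSignedUrl(client, new UploadPartCommand({
    Bucket: process.env.COMPANION_AWS_BUCKET,
    Key: key,
    UploadId: uploadId,
    PartNumber: Number(partNumber),
  }), { expiresIn: 900 })
    .then((url) => { res.json({ url }) })
    .catch(next)
})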

Note: I believe the error posted at the beginning of this thread is still valid.

The good thing is that it seems the serviceWorker may not be needed in order to implement a client -> S3 upload solution with support for resumable uploads.

For reference, I had used the example for the nodejs-aws server endpoints that are in this file: https://github.com/transloadit/uppy/blob/main/examples/aws-nodejs/index.js

The problem code is in this part:

app.get('/s3/multipart/:uploadId', (req, res, next) => {
  const client = getS3Client()
  const { uploadId } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter. For example: "?key=abc.jpg"' })
    return
  }

  const parts = []

  function listPartsPage (startAt) {
    client.send(new ListPartsCommand({
      Bucket: process.env.COMPANION_AWS_BUCKET,
      Key: key,
      UploadId: uploadId,
      // PartNumberMarker is sent on every request, including the very first
      // one (as 0 below) — this is what the fix further down avoids.
      PartNumberMarker: startAt,
    }), (err, data) => {
      if (err) {
        next(err)
        return
      }

      parts.push(...data.Parts)

      if (data.IsTruncated) {
        // Get the next page.
        listPartsPage(data.NextPartNumberMarker)
      } else {
        res.json(parts)
      }
    })
  }
  // The first call always passes 0 as the marker.
  listPartsPage(0)
})

The solution that seems to work well is changing how listParts is called. This is what I came up with (the endpoint handler needs to become asynchronous); see the full handler sketch after the snippet:


 // Inside an async route handler: s3Client, AWS_BUCKET, key and uploadId
 // come from the surrounding route (see the full sketch below).
 const parts = [];

 let data = await s3Client.send(
   new ListPartsCommand({
     Bucket: AWS_BUCKET,
     Key: key,
     UploadId: uploadId,
     MaxParts: 1000,
   })
 );
 parts.push(...data.Parts);

 // continue to get list of all uploaded parts until the IsTruncated flag is false
 while (data.IsTruncated) {
   data = await s3Client.send(
     new ListPartsCommand({
       Bucket: AWS_BUCKET,
       Key: key,
       UploadId: uploadId,
       MaxParts: 1000,
       PartNumberMarker: data.NextPartNumberMarker,
     })
   );
   parts.push(...data.Parts);
 }

 res.json(parts);
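For completeness, here is roughly how that snippet fits into the Express route once the handler is made async; getS3Client, ListPartsCommand and the COMPANION_AWS_BUCKET environment variable are assumed to be the same ones used in the example above.

app.get('/s3/multipart/:uploadId', async (req, res, next) => {
  const client = getS3Client()
  const { uploadId } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter. For example: "?key=abc.jpg"' })
    return
  }

  try {
    const parts = []

    // First page: no PartNumberMarker, so S3 starts from the beginning.
    let data = await client.send(new ListPartsCommand({
      Bucket: process.env.COMPANION_AWS_BUCKET,
      Key: key,
      UploadId: uploadId,
      MaxParts: 1000,
    }))
    parts.push(...(data.Parts ?? []))

    // Keep paging until S3 reports there are no more parts.
    while (data.IsTruncated) {
      data = await client.send(new ListPartsCommand({
        Bucket: process.env.COMPANION_AWS_BUCKET,
        Key: key,
        UploadId: uploadId,
        MaxParts: 1000,
        PartNumberMarker: data.NextPartNumberMarker,
      }))
      parts.push(...(data.Parts ?? []))
    }

    res.json(parts)
  } catch (err) {
    next(err)
  }
})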

I have submitted a PR to fix that issue in the aws-nodejs example to hopefully help others avoid the same error.

I hope this helps others who are trying to set up a similar solution using Uppy.

EDIT: I have realized that the solution I explained here does not need to change the code much; it is just a matter of conditionally passing PartNumberMarker into the config of the ListPartsCommand that is sent to S3.

I have also updated the PR to reflect this. Here is what that endpoint handler looks like now:

app.get('/s3/multipart/:uploadId', (req, res, next) => {
  const client = getS3Client()
  const { uploadId } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter. For example: "?key=abc.jpg"' })
    return
  }

  const parts = [];

  function listPartsPage(startsAt = undefined) {
    let config = {
      Bucket: process.env.COMPANION_AWS_BUCKET,
      Key: key,
      UploadId: uploadId,
    };

    if (startsAt) config.PartNumberMarker = startsAt;

    client.send(new ListPartsCommand(config), (err, data) => {
      if (err) {
        next(err);
        return;
      }

      parts.push(...data.Parts);

      // continue to get list of all uploaded parts until the IsTruncated flag is false
      if (data.IsTruncated) {
        listPartsPage(data.NextPartNumberMarker);
      } else {
        res.json(parts);
      }
    });
  }
  listPartsPage();
})

@aduh95 aduh95 changed the title uppy-multipart + golden retriever asw-s3-multipart + golden retriever May 23, 2024
@aduh95 aduh95 changed the title asw-s3-multipart + golden retriever aws-s3-multipart + golden retriever May 23, 2024