
aws-s3-multipart + golden retriever #5180

Open · 2 tasks done
theolundqvist opened this issue May 17, 2024 · 3 comments · May be fixed by #5192

@theolundqvist

Initial checklist

  • I understand this is a bug report and questions should be posted in the Community Forum
  • I searched issues and couldn’t find anything (or linked relevant results below)

Link to runnable example

No response

Steps to reproduce

This is similar to #4985, where the problem was with "listParts". However, my "listParts" function is never run, not even when I declare it in JS in "use(AwsS3", so I guess it crashes before it gets that far.

Use multipart + golden retriever
[Screenshot: CleanShot 2024-05-17 at 13 50 16]
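Since the screenshot is not available here, the following is a minimal sketch (not the exact code from the screenshot) of what such a setup typically looks like: @uppy/aws-s3 in multipart mode with custom endpoint functions, plus Golden Retriever. The /s3/multipart/... paths, the api helper, and the request bodies are placeholders for whatever the backend exposes.

import Uppy from '@uppy/core'
import AwsS3 from '@uppy/aws-s3'
import GoldenRetriever from '@uppy/golden-retriever'

// Hypothetical helper for the backend routes; adjust to your own API.
const api = (path, options) =>
  fetch(`/s3/multipart${path}`, options).then((response) => response.json())

const uppy = new Uppy()
  .use(AwsS3, {
    shouldUseMultipart: true,
    createMultipartUpload: (file) =>
      api('', {
        method: 'POST',
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify({ filename: file.name, type: file.type }),
      }),
    // The complaint in this issue: this override is reportedly never called
    // when Golden Retriever restores the files after a refresh.
    listParts: (file, { uploadId, key }) =>
      api(`/${uploadId}?key=${encodeURIComponent(key)}`),
    signPart: (file, { uploadId, key, partNumber }) =>
      api(`/${uploadId}/${partNumber}?key=${encodeURIComponent(key)}`),
    completeMultipartUpload: (file, { uploadId, key, parts }) =>
      api(`/${uploadId}/complete?key=${encodeURIComponent(key)}`, {
        method: 'POST',
        headers: { 'content-type': 'application/json' },
        body: JSON.stringify({ parts }),
      }),
    abortMultipartUpload: (file, { uploadId, key }) =>
      api(`/${uploadId}?key=${encodeURIComponent(key)}`, { method: 'DELETE' }),
  })
  .use(GoldenRetriever, { serviceWorker: true })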
I have implemented the companion endpoints for multipart uploads in laravel as shown below:
[Screenshot: CleanShot 2024-05-17 at 13 52 58]

Everything works perfectly, except restoring the upload.

  1. Upload a 3 GB file until it reaches 10%.
  2. Refresh the page.
  3. Open Uppy (it shows "we have restored all files").
  4. Crash.

Expected behavior

Restore all chunks and resume the upload.

Actual behavior

[Error screenshots]

Crashes on reading size when restoring files.

Thanks for the help.

@Mitchell8210

Mitchell8210 commented May 17, 2024

I am experiencing the exact same problem implementing Uppy with the S3 multipart upload + Golden Retriever, trying to get it to resume the file upload where it left off after the connection is cut by a browser page refresh.
(I have been successful if I leave out {serviceWorker: true} from the config, but that is not suitable for my requirements, as the upload restarts from the beginning if I lose the connection.)
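For reference, the serviceWorker setup being referred to looks roughly like the sketch below, based on the Golden Retriever docs; the '/sw.js' path is a placeholder for wherever your bundler serves a copy of @uppy/golden-retriever/lib/ServiceWorker.

import Uppy from '@uppy/core'
import GoldenRetriever from '@uppy/golden-retriever'

const uppy = new Uppy().use(GoldenRetriever, { serviceWorker: true })

// The service worker lets Golden Retriever keep the file blobs around across
// page reloads, so a refresh does not force the upload back to zero.
if ('serviceWorker' in navigator) {
  navigator.serviceWorker
    .register('/sw.js') // placeholder path for the bundled Golden Retriever ServiceWorker
    .then((registration) => {
      console.log('ServiceWorker registered with scope:', registration.scope)
    })
    .catch((error) => {
      console.error('ServiceWorker registration failed:', error)
    })
}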

The main difference between your implementation and mine is that I did not use Companion or Tus, because I really do not want to set up a server just to transfer the file to S3 in the end. I think it is more efficient, and not overly complicated, to achieve resumable uploads without another server for the file to flow through before it reaches S3.

I am thinking that this may not be resolved and that I may have to build a solution myself that mimics this library.

@Murderlon
Member

> Thinking that this may not be resolved and may have to build a solution myself that mimics this library.

It would probably be better for the ecosystem, and save you some effort, if you are willing to contribute rather than building it again. Would you be interested in taking a look?

@Mitchell8210

Mitchell8210 commented May 22, 2024

I wanted to post an update about this issue. My co-workers took a crack at implementing Uppy with the Golden Retriever plugin without the serviceWorker flag and were successful.

The main issue they solved was an invalid-signature error (S3 signing) when attempting to resume the upload from where it left off. After this was resolved, the solution works the way we need it to: resumable uploads without another server, and without Companion or Tus.

My suspicion is that the invalid-signature error was caused by a malformed listParts command on the server, which is used when attempting to resume the upload.
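For context, the part-signing route in that same aws-nodejs example (the place where the presigned part URL, and therefore the signature, is produced) looks roughly like the sketch below. It is simplified: app and getS3Client are the same helpers used elsewhere in that example, and the exact response shape and expiry in the real file may differ.

const { UploadPartCommand } = require('@aws-sdk/client-s3')
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner')

app.get('/s3/multipart/:uploadId/:partNumber', (req, res, next) => {
  const client = getS3Client()
  const { uploadId, partNumber } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter.' })
    return
  }

  // Presign a URL the browser can PUT this part to; the signature covers the
  // bucket, key, uploadId and part number.
  getSignedUrl(client, new UploadPartCommand({
    Bucket: process.env.COMPANION_AWS_BUCKET,
    Key: key,
    UploadId: uploadId,
    PartNumber: Number(partNumber),
  }), { expiresIn: 900 })
    .then((url) => { res.json({ url }) })
    .catch(next)
})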

Note: I believe the error posted at the beginning of this thread is still valid.

The good thing is that it seems the serviceWorker may not be needed in order to implement a client -> S3 upload solution with support for resumable uploads.

For reference, I had used the example for the nodejs-aws server endpoints that are in this file: https://github.com/transloadit/uppy/blob/main/examples/aws-nodejs/index.js

The problem code is in this part:

app.get('/s3/multipart/:uploadId', (req, res, next) => {
  const client = getS3Client()
  const { uploadId } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter. For example: "?key=abc.jpg"' })
    return
  }

  const parts = []

  function listPartsPage (startAt) {
    client.send(new ListPartsCommand({
      Bucket: process.env.COMPANION_AWS_BUCKET,
      Key: key,
      UploadId: uploadId,
      // PartNumberMarker is sent on every request, including the very first
      // one (as 0 below) — this is what the fix further down avoids.
      PartNumberMarker: startAt,
    }), (err, data) => {
      if (err) {
        next(err)
        return
      }

      parts.push(...data.Parts)

      if (data.IsTruncated) {
        // Get the next page.
        listPartsPage(data.NextPartNumberMarker)
      } else {
        res.json(parts)
      }
    })
  }
  // The first call always passes 0 as the marker.
  listPartsPage(0)
})

The solution that seems to work well is changing how listParts is called. This is what I came up with (the endpoint handler needs to become asynchronous); see the full handler sketch after the snippet:


 // Inside an async route handler: s3Client, AWS_BUCKET, key and uploadId
 // come from the surrounding route (see the full sketch below).
 const parts = [];

 let data = await s3Client.send(
   new ListPartsCommand({
     Bucket: AWS_BUCKET,
     Key: key,
     UploadId: uploadId,
     MaxParts: 1000,
   })
 );
 parts.push(...data.Parts);

 // continue to get list of all uploaded parts until the IsTruncated flag is false
 while (data.IsTruncated) {
   data = await s3Client.send(
     new ListPartsCommand({
       Bucket: AWS_BUCKET,
       Key: key,
       UploadId: uploadId,
       MaxParts: 1000,
       PartNumberMarker: data.NextPartNumberMarker,
     })
   );
   parts.push(...data.Parts);
 }

 res.json(parts);
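For completeness, here is roughly how that snippet fits into the Express route once the handler is made async; getS3Client, ListPartsCommand and the COMPANION_AWS_BUCKET environment variable are assumed to be the same ones used in the example above.

app.get('/s3/multipart/:uploadId', async (req, res, next) => {
  const client = getS3Client()
  const { uploadId } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter. For example: "?key=abc.jpg"' })
    return
  }

  try {
    const parts = []

    // First page: no PartNumberMarker, so S3 starts from the beginning.
    let data = await client.send(new ListPartsCommand({
      Bucket: process.env.COMPANION_AWS_BUCKET,
      Key: key,
      UploadId: uploadId,
      MaxParts: 1000,
    }))
    parts.push(...(data.Parts ?? []))

    // Keep paging until S3 reports there are no more parts.
    while (data.IsTruncated) {
      data = await client.send(new ListPartsCommand({
        Bucket: process.env.COMPANION_AWS_BUCKET,
        Key: key,
        UploadId: uploadId,
        MaxParts: 1000,
        PartNumberMarker: data.NextPartNumberMarker,
      }))
      parts.push(...(data.Parts ?? []))
    }

    res.json(parts)
  } catch (err) {
    next(err)
  }
})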

I have submitted a PR to fix that issue in the aws-nodejs example to hopefully help others avoid the same error.

I hope this helps others who are trying to set up a similar solution using Uppy.

EDIT: I have realized that the solution I explained here does not need to change the code much; it is just a matter of conditionally passing PartNumberMarker into the config of the ListPartsCommand that is sent to S3.

I have also updated the PR to reflect this. Here is what that endpoint handler looks like now:

app.get('/s3/multipart/:uploadId', (req, res, next) => {
  const client = getS3Client()
  const { uploadId } = req.params
  const { key } = req.query

  if (typeof key !== 'string') {
    res.status(400).json({ error: 's3: the object key must be passed as a query parameter. For example: "?key=abc.jpg"' })
    return
  }

  const parts = [];

  function listPartsPage(startsAt = undefined) {
    let config = {
      Bucket: process.env.COMPANION_AWS_BUCKET,
      Key: key,
      UploadId: uploadId,
    };

    if (startsAt) config.PartNumberMarker = startsAt;

    client.send(new ListPartsCommand(config), (err, data) => {
      if (err) {
        next(err);
        return;
      }

      parts.push(...data.Parts);

      // continue to get list of all uploaded parts until the IsTruncated flag is false
      if (data.IsTruncated) {
        listPartsPage(data.NextPartNumberMarker);
      } else {
        res.json(parts);
      }
    });
  }
  listPartsPage();
})

@aduh95 aduh95 changed the title uppy-multipart + golden retriever asw-s3-multipart + golden retriever May 23, 2024
@aduh95 aduh95 changed the title asw-s3-multipart + golden retriever aws-s3-multipart + golden retriever May 23, 2024