Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: dates with spaces or no month/day are not able to be parsed #2833

Open
2 tasks done
michaelkrieger opened this issue Jan 30, 2024 · 4 comments
Open
2 tasks done

Comments

@michaelkrieger
Copy link

I confirm that:

  • I have searched the existing open AND closed issues to see if an issue already exists for the bug I've encountered
  • I'm using the latest version (your issue may have been fixed already)

Version

0.51.0

Current Behavior

Errors are logged:

msg="Error parsing date field for month + day" date="2014- 6- 9" file="...m4a"
msg="Error parsing date field for month + day" date="2011- 9-27" file="...m4a"
msg="Error parsing date field for month + day" date="2014- 6- 9" file="....m4a"
msg="Error parsing date field for month + day" date=2013-00-00 file="....mp3"

Expected Behavior

Navidrome should use its best efforts to accommodate these odd tags

Steps To Reproduce

  1. Have a file with a date format including these spaces instead of a zero for the date
  2. Scan the library

Environment

- OS: Debian Linux / Docker
- Browser: Chrome
- Client: N/A

How Navidrome is installed?

Docker

Configuration

ND_SCANSCHEDULE: 24h
      ND_LOGLEVEL: info
      ND_SESSIONTIMEOUT: 48h
      ND_PLAYLISTSPATH: "**/Playlists/**"
      ND_ENABLEEXTERNALSERVICES: "true"

Relevant log output

msg="Error parsing date field for month + day" date="2014- 6- 9" file="...m4a"
msg="Error parsing date field for month + day" date="2011- 9-27" file="...m4a"
msg="Error parsing date field for month + day" date="2014- 6- 9" file="....m4a"
msg="Error parsing date field for month + day" date=2013-00-00 file="....mp3"


### Anything else?

First, I think both of these date formats are improper.  Clearly a leading zero should be in the date (ie: 2014-06-09) and the 2013-00-00 should just be 2013.  That said, a surprising number of files, particularly M4A files, seem to have dates in this format, which makes me think it's out there, possibly produced by an Apple service or some tagger.

I do think that Navidrome, given that these formats are out there and appear to be pretty widely out there, should be able to interpret these dates, removing the month-day for the 00-00 and adding the 0's for the date with spaces.

### Code of Conduct

- [X] I agree to follow Navidrome's Code of Conduct
@michaelkrieger michaelkrieger added bug triage New bug reports that need to be evaluated labels Jan 30, 2024
@deluan
Copy link
Member

deluan commented Jan 31, 2024

First, I think both of these date formats are improper.

Glad you acknowledge that :)

There has been a lot of discussions regarding parsing dates and we decided to go with a strict validation: Dates can be in the formats YYYY-MM-DD, YYYY-MM or YYYY, padded with zeroes and that's it. Most if not all modern taggers use these formats (@certuna can keep me honest here)

I never saw the formats you are seen in the wild, and I even have a bunch of M4A files in my library, either bought from Apple or tagged with iTunes.

Having said that, I don't see any issues supporting these two (invalid) formats (YYYY-00-00 and space-padded day and month), if @certuna does not see any issues with that.

I'll tag this issue as a good-first-issue and leave it as an exercise for new-comers to the project. The function that parses the date is found here:

func (t Tags) getDate(tagNames ...string) (int, string) {

and its tests are here:

DescribeTable("getDate",
func(tag string, expectedYear int, expectedDate string) {
md := &Tags{}
md.Tags = map[string][]string{"date": {tag}}
testYear, testDate := md.Date()
Expect(testYear).To(Equal(expectedYear))
Expect(testDate).To(Equal(expectedDate))
},
Entry(nil, "1985", 1985, "1985"),
Entry(nil, "2002-01", 2002, "2002-01"),
Entry(nil, "1969.06", 1969, "1969"),
Entry(nil, "1980.07.25", 1980, "1980"),
Entry(nil, "2004-00-00", 2004, "2004"),
Entry(nil, "2016-12-31", 2016, "2016-12-31"),
Entry(nil, "2013-May-12", 2013, "2013"),
Entry(nil, "May 12, 2016", 2016, "2016"),
Entry(nil, "01/10/1990", 1990, "1990"),
Entry(nil, "invalid", 0, ""),
)

One just need to add the necessary formats to the tests, and make the code works with it.

If anyone wants to work on this, feel free to submit a PR. Reach out if you have any questions.

@deluan deluan added enhancement go Go code good first issue and removed bug triage New bug reports that need to be evaluated labels Jan 31, 2024
@certuna
Copy link
Contributor

certuna commented Feb 1, 2024

Yeah these are invalid date formats, so they aren't parsed. Not a bug.

Up to you @deluan how much effort we want to put into the code to parse this. We'll never be able to catch all: 2001/12/31 or 12-31-2001 or 31.12.2001 or Wednesday 31 December 2001, there are infinite ways that people can write off-spec dates in that frame.

@michaelkrieger
Copy link
Author

I certainly think the 2023-00-00 is fairly common and I was shocked how many m4a’s had the spaces.

I think the 00-00 should be handled and treated as a year. I don’t really know if the one with spaces should be (as I agree it is wrong) but it just seems easy enough to detect.

many of the others aren’t used for tags as they’re not easily machine readable.

@certuna
Copy link
Contributor

certuna commented Feb 2, 2024

2023-00-00 and 2023- 1- 2 are both parsed as 2023, as you can see from the log, it only fails on the month+day part, the year is parsed correctly.

If you look at the code, it does three steps with the tag value:

  1. checks if the first 4 characters are a valid year (=regex matching 1000-2999) -> this sets the Year (int) and the Date (string in the form YYYY)
  2. if the tag value has >4 characters, checks with the time.parse function if the tag value can be parsed as YYYY-MM-DD, if so: Date = tag value
  3. if 2. fails, checks if YYYY-MM works, if so: Date = tag value

(this is done in this order for speed/efficiency: if the tag only has a 4-digit year, it's a waste of time to parse it as a full date)

So what happens when it encounters 2023-00-00?

  1. Year = 2023 , Date = "2023"
  2. fails
  3. fails -> "Error parsing month+day" in log

When it encounters 2023-12:

  1. Year = 2023 , Date = "2023"
  2. fails
  3. Date = "2023-12"

When it encounters 2023- 2- 1:

  1. Year = 2023 , Date = "2023"
  2. fails
  3. fails -> "Error parsing month+day" in log

When it encounters 2023:

  1. Year = 2023 , Date = "2023"
    (2. and 3. are not done because the tag value is <5 chars)

When it encounters a non-date value like Bob:

  1. Year = 0 , Date = "" -> "Error parsing year" in log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants