chore: query range v3 metrics use v4 tables #5021

Merged
srikanthccv merged 7 commits into develop from v3-apis-use-v4 on May 21, 2024

Conversation

srikanthccv
Member

@srikanthccv srikanthccv commented May 16, 2024

Summary

Part of #4845

Related Issues / PR's

Screenshots

NA

Affected Areas and Manually Tested Areas


Build Error! No Linked Issue found. Please link an issue or mention it in the body using #<issue_id>

@github-actions github-actions bot added the chore label May 16, 2024
@srikanthccv srikanthccv marked this pull request as ready for review May 17, 2024 10:57
@nityanandagohain
Member

Can you give me some idea of when time_series_v4, time_series_v4_6hrs, and time_series_v4_1day get used for different queries?

@srikanthccv
Member Author

Here are the details. In general, the set of time series doesn't change often: the same set of labels keeps reporting data. For example, {service:frontend, api_name:query_range} would report data every 30 seconds. To exploit this, we use two tables (to reduce the amount of redundant data we store and read). We have a time_series_v2 table, which attempts to store at most one instance of each time series. However, this assumption doesn't always hold. Take process metrics, for example: processes run, die, and start again with new IDs. Over time, the number of time series belonging to such dead and irrelevant processes becomes high.

To address this, we introduced the time_series_v4 tables, which store one instance of the labels per 1hr, 6hrs, and 1day, i.e. each table has one row per unique label set for every 1hr, 6hrs, and 1day window respectively. If we only had the 1hr table, reading 2 days of data would mean reading the same label set once for each hour. To improve on this, we use the 1 day table, which reduces the number of times the same labels are read from 48 to 2.

We dynamically choose which table to use depending on the time range of the query: for a range of [0-6hrs], time_series_v4; for [6hrs-24hrs], time_series_v4_6hrs; and for [24hrs-], time_series_v4_1day. All of this is done to reduce the amount of data we read.

@nityanandagohain
Member

We dynamically choose which table to use depending on the time range of the query: for a range of [0-6hrs], time_series_v4; for [6hrs-24hrs], time_series_v4_6hrs; and for [24hrs-], time_series_v4_1day. All of this is done to reduce the amount of data we read.

Where is this logic present, or is it yet to be implemented?

@srikanthccv
Member Author

It is already part of the codebase:

// start and end are in milliseconds
func which(start, end int64) (int64, int64, string) {
	// If time range is less than 6 hours, we need to use the `time_series_v4` table
	// else if time range is less than 1 day and greater than 6 hours, we need to use the `time_series_v4_6hrs` table
	// else we need to use the `time_series_v4_1day` table
	var tableName string
	if end-start <= sixHoursInMilliseconds {
		// adjust the start time to nearest 1 hour
		start = start - (start % (time.Hour.Milliseconds() * 1))
		tableName = constants.SIGNOZ_TIMESERIES_v4_LOCAL_TABLENAME
	} else if end-start <= oneDayInMilliseconds {
		// adjust the start time to nearest 6 hours
		start = start - (start % (time.Hour.Milliseconds() * 6))
		tableName = constants.SIGNOZ_TIMESERIES_v4_6HRS_LOCAL_TABLENAME
	} else {
		// adjust the start time to nearest 1 day
		start = start - (start % (time.Hour.Milliseconds() * 24))
		tableName = constants.SIGNOZ_TIMESERIES_v4_1DAY_LOCAL_TABLENAME
	}
	return start, end, tableName
}

// PrepareTimeseriesFilterQuery builds the sub-query to be used for filtering timeseries based on the search criteria
func PrepareTimeseriesFilterQuery(start, end int64, mq *v3.BuilderQuery) (string, error) {
	var conditions []string
	var fs *v3.FilterSet = mq.Filters
	var groupTags []v3.AttributeKey = mq.GroupBy
	conditions = append(conditions, fmt.Sprintf("metric_name = %s", utils.ClickHouseFormattedValue(mq.AggregateAttribute.Key)))
	conditions = append(conditions, fmt.Sprintf("temporality = '%s'", mq.Temporality))
	start, end, tableName := which(start, end)
.

@srikanthccv srikanthccv merged commit de497bf into develop May 21, 2024
15 checks passed
@srikanthccv srikanthccv deleted the v3-apis-use-v4 branch May 21, 2024 06:31
2 participants