If the first [0, debounce] checks have all failed, the debounce argument has no effect #585

Open
wants to merge 11 commits into
base: master

@dengshaochun dengshaochun commented Dec 7, 2017

When I set up a new check with debounce = 5 and the very first check fails, an alert is raised immediately.

The case looks like this:

"""
Case:
len(recent_results) == 1 and recent_results[0].succeeded == 0

Result:
return False
"""

def calculate_debounced_passing(recent_results, debounce=0):
    """
    `debounce` is the number of previous failures we need (not including this)
    to mark a search as passing or failing
    Returns:
      True if passing given debounce factor
      False if failing
    """
    if not recent_results:
        return True
    debounce_window = recent_results[:debounce + 1]
    for r in debounce_window:
        if r.succeeded:
            return True
    return False
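
A minimal sketch that reproduces the reported case, assuming `recent_results` is ordered newest first. `Result` is a hypothetical stand-in for Cabot's result model, and `calculate_debounced_passing_short_history` is only one assumed way to treat a too-short history; it is not the change proposed in this pull request.

```python
from collections import namedtuple

# Hypothetical stand-in for a check result; only `succeeded` matters here.
Result = namedtuple("Result", ["succeeded"])


def calculate_debounced_passing(recent_results, debounce=0):
    """The function as quoted above, unchanged in behavior."""
    if not recent_results:
        return True
    debounce_window = recent_results[:debounce + 1]
    for r in debounce_window:
        if r.succeeded:
            return True
    return False


def calculate_debounced_passing_short_history(recent_results, debounce=0):
    """Assumed variant: fewer than debounce + 1 results counts as passing."""
    debounce_window = recent_results[:debounce + 1]
    if len(debounce_window) < debounce + 1:
        # Not enough history yet to have exceeded the debounce.
        return True
    return any(r.succeeded for r in debounce_window)


# A brand-new check whose first and only result is a failure.
first_failure = [Result(succeeded=False)]
original_verdict = calculate_debounced_passing(first_failure, debounce=5)
variant_verdict = calculate_debounced_passing_short_history(first_failure, debounce=5)
print(original_verdict, variant_verdict)  # False True
```

With the quoted function, a single failing result is enough to mark the check as failing even though debounce is 5; the assumed variant waits until the window is full.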

frankh and others added 3 commits November 22, 2017 18:17

codecov bot commented Dec 7, 2017

Codecov Report

Merging #585 into master will increase coverage by 0.06%.
The diff coverage is 82.35%.

@@            Coverage Diff             @@
##           master     #585      +/-   ##
==========================================
+ Coverage   80.89%   80.95%   +0.06%     
==========================================
  Files          46       45       -1     
  Lines        2952     2920      -32     
  Branches      179      178       -1     
==========================================
- Hits         2388     2364      -24     
+ Misses        505      497       -8     
  Partials       59       59
| Impacted Files | Coverage Δ |
|---|---|
| cabot/cabotapp/views.py | 71.19% <ø> (+0.3%) ⬆️ |
| cabot/cabotapp/jenkins.py | 97.14% <100%> (ø) ⬆️ |
| cabot/cabotapp/models/base.py | 79.22% <71.42%> (-0.3%) ⬇️ |
| cabot/cabotapp/models/jenkins_check_plugin.py | 66.66% <83.33%> (+1.14%) ⬆️ |
| cabot/urls.py | 82.5% <0%> (-0.84%) ⬇️ |
| cabot/templates/base.html | 94.59% <0%> (-0.28%) ⬇️ |
| cabot/settings.py | 68.42% <0%> (ø) ⬆️ |
| ... and 5 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4e4ca0a...8cedc16.

JeanFred and others added 7 commits March 9, 2018 15:53
One line break too many.
There are vulnerabilities for the intermediate versions.

Changelog: https://docs.djangoproject.com/en/2.0/releases/#id1
We want to override `calculate_debounced_passing` for the JenkinsCheck.
Add unit tests covering
- when there is no build of the job at all
- when there is no good build (i.e., only failing builds)
Debounce is “the number of successive failures
permitted before check will be marked as failed”.
It is very useful to avoid alerts on expected hiccups.

For checks whose retry logic lies in Cabot using `frequency`
(which is the case for Graphite, HTTP, and ICMP checks),
it makes sense that the debounce is about how often Cabot retried things.

For JenkinsChecks, however, we have no control over
how often Cabot checks the job. This means that even a
debounce of e.g. 5 can trigger an alert over a single job failure.

A simpler implementation would be to loop over the
recent results, count how many distinct jobs have failed
(using the job number stored in the `status_check_result`),
and set the status to fail if this count is higher than the debounce.
However, Cabot only considers the last 10 results (hardcoded value).
Since Cabot checks the job at a fairly high frequency (or at least a
frequency higher than the Jenkins run frequency), the status could
switch back to pass once a single failed job has been checked 10 times.
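
The limitation of that simpler approach can be sketched as follows. This is an illustration under assumptions, not code from the commit: `Result` and its `job_number` field are hypothetical names, and results are assumed to be ordered newest first.

```python
from collections import namedtuple

# Hypothetical, simplified result record; `job_number` is an assumed field name.
Result = namedtuple("Result", ["succeeded", "job_number"])

RESULT_WINDOW = 10  # the commit message says only the last 10 results are considered


def passing_by_distinct_failed_jobs(results, debounce=0):
    """Count distinct failed job numbers within the result window."""
    window = results[:RESULT_WINDOW]
    failed_jobs = {r.job_number for r in window if not r.succeeded}
    return len(failed_jobs) <= debounce


# Jobs 41, 42 and 43 all failed; the window still sees all three.
history = [Result(False, 43)] * 3 + [Result(False, 42)] * 4 + [Result(False, 41)] * 3
early_verdict = passing_by_distinct_failed_jobs(history, debounce=2)

# Ten more checks of the same failed job 43 push 41 and 42 out of the window.
late_history = [Result(False, 43)] * 10 + history
late_verdict = passing_by_distinct_failed_jobs(late_history, debounce=2)

print(early_verdict, late_verdict)  # False True
```

Nothing was fixed between the two calls, yet the verdict flips to passing because the window only remembers one failed job.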

We thus need to enrich the StatusCheckResult data model
to store that information.

- Add field `consecutive_failures` to StatusCheckResult model
  (and associated migration).

- Retrieve from Jenkins the last good build, and compute from
  that the number of consecutive failures

- Also display the consecutive failures in the Check results page
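
One possible way to derive the count from build numbers, as a sketch only. The function names are hypothetical, and it assumes build numbers are sequential, start at 1, and that every build after the last good one failed; the commit's actual computation is not shown on this page.

```python
from typing import Optional


def count_consecutive_failures(last_build: Optional[int],
                               last_good_build: Optional[int]) -> int:
    """Number of builds since the last good one (assumed to have all failed)."""
    if last_build is None:
        # The job has never been built, so there is nothing to count.
        return 0
    if last_good_build is None:
        # No good build at all: assume every build so far failed.
        return last_build
    return max(last_build - last_good_build, 0)


def is_passing(consecutive_failures: int, debounce: int = 0) -> bool:
    """Fail only once the failures exceed the permitted debounce."""
    return consecutive_failures <= debounce


failures = count_consecutive_failures(last_build=43, last_good_build=40)
print(failures, is_passing(failures, debounce=5))  # 3 True
```

Because the count comes from Jenkins rather than from the number of times Cabot polled, repeated checks of the same failed build do not change the verdict.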

Closes arachnys#537
@frankh force-pushed the master branch 2 times, most recently from eaa8c3b to 20fada0, March 21, 2018 11:22

3 participants