3.12 Performance #648

Closed · idatsy opened this issue Mar 8, 2024 · 10 comments

idatsy commented Mar 8, 2024

Describe the bug
Significantly worse performance on Python 3.12.

Minimal working example
I need to line-profile in more detail, but Machine creation appears to be about 5x slower in my use case.
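
A minimal sketch of the kind of measurement I have in mind (placeholder state and transition names, not the actual model):

from cProfile import Profile
from transitions import Machine

states = [f"State_{i}" for i in range(50)]
transitions = [["go", a, b] for a, b in zip(states[:-1], states[1:])]

with Profile() as profiler:
    Machine(states=states, transitions=transitions, initial=states[0])
profiler.print_stats("cumulative")  # shows where Machine.__init__ spends its time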

Additional context
Highly anecdotal at the moment, but I figured I'd see if anyone else is having a similar issue. For my use case it's probably as expensive to rewrite as to debug.

aleneum (Member) commented Mar 8, 2024

Hello @idatsy,

thanks for the info. If you have an MWE for reproduction, let me know.

philipbjorge commented Apr 25, 2024

Update

The report below was related to using transitions with many models.
#146 (comment)

We didn't have a clean way to use a global state machine with add_model and remove_model, so we resolved the issue by removing the library.
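
For reference, the pattern we would have needed looks roughly like this (a hypothetical sketch; class, state and trigger names are made up):

from transitions import Machine

class Evaluation:
    pass

# One shared Machine instead of one Machine per model; models are attached
# and detached with add_model / remove_model.
shared_machine = Machine(
    model=None,
    states=["created", "running", "done"],
    transitions=[["start", "created", "running"], ["finish", "running", "done"]],
    initial="created",
    auto_transitions=False,
)

evaluation = Evaluation()
shared_machine.add_model(evaluation)     # decorates the model with trigger methods and .state
evaluation.start()
shared_machine.remove_model(evaluation)  # detach once the model is no longer needed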


We're seeing poor performance on Python 3.11 -- I don't have data from earlier Python versions to indicate whether this is a regression or not.

[screenshot: CleanShot 2024-04-25 at 11:06]

Using...

        self._machine = Machine(
            model=self,
            states=EvaluationState,
            initial=self.state or EvaluationState.CREATED,
            transitions=self.transitions(),
            send_event=True,
            auto_transitions=False,
        )

With 13 state transitions.
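
For context, a self-contained sketch of a comparable setup (the EvaluationState members and transitions below are illustrative assumptions, not our actual definitions):

from enum import Enum
from transitions import Machine

class EvaluationState(Enum):
    CREATED = 1
    RUNNING = 2
    COMPLETED = 3

# Roughly the shape of what self.transitions() returns: a list of
# trigger/source/dest dictionaries (the real model has 13 of these).
example_transitions = [
    {"trigger": "start", "source": EvaluationState.CREATED, "dest": EvaluationState.RUNNING},
    {"trigger": "complete", "source": EvaluationState.RUNNING, "dest": EvaluationState.COMPLETED},
]

machine = Machine(
    states=EvaluationState,
    initial=EvaluationState.CREATED,
    transitions=example_transitions,
    send_event=True,
    auto_transitions=False,
)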

aleneum (Member) commented May 6, 2024

Unfortunately, I cannot reproduce @idatsy's observation of "significantly worse performance on Python 3.12". Machine initialisation got slightly faster for Python 3.11 and 3.12.

[chart: Machine initialisation time across Python versions]

There does not seem to be a significant difference between different versions of transitions:

[chart: Machine initialisation time across transitions versions]

There seems to be some increase in latency when auto_transitions=True, but it's nowhere near 400% (more like 20%). Single transitions (trans=True) do not influence the initialisation time much. When trans=True, one state transition is added with the trigger 'go' from state<X> to state<X+1> (N - 1 in total). But as I said, their influence is negligible, especially compared to auto_transitions, which adds N - 1 transitions to each state. The transitions version order for each bar group is (0.7.2, 0.8.0, 0.8.11, 0.9.0).

[chart: initialisation time with and without auto_transitions and explicit transitions]
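
To illustrate what auto_transitions=True adds, a minimal example (state names assumed):

from transitions import Machine

# With auto_transitions=True (the default), every state gets a generated
# "to_<state>" trigger that is valid from any source state.
machine = Machine(states=["State_0", "State_1", "State_2"], initial="State_0",
                  auto_transitions=True)
machine.to_State_2()   # jump straight to State_2 via the generated auto transition
print(machine.state)   # State_2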

I will test whether there has been a regression for transition executions and multiple models, but leads would be highly appreciated.


This is my test code (tested on a MacBook M1 2021 with Conda):

from transitions import Machine, __version__

state_nr = [1, 5, 10, 20, 50, 75, 100]
trials = 5
n_runs = 2000 // trials

state_dict = {n: [f"State_{i}" for i in range(n)] for n in state_nr}
trans_dict = {n: [["go", a, b] for a, b in zip(state_dict[n][:-1], state_dict[n][1:])] for n in state_nr}


def test_init_machine(n, with_auto, with_trans):
    for _ in range(n_runs):
        _ = Machine(states=state_dict[n], transitions=trans_dict[n] if with_trans else None, initial=state_dict[n][0], auto_transitions=with_auto)

if __name__ == '__main__':
    import timeit
    import sys
    pyver = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"{pyver} : {__version__}")
    with open('init_results.csv', 'a') as f:
        for n in state_nr:
            for with_auto in [True, False]:
                for with_trans in [True, False]:
                    samples = [timeit.timeit(f'test_init_machine({n}, {with_auto}, {with_trans})', globals=globals(), number=1) * 1000 / n_runs for _ in range(trials)]
                    mean = sum(samples) / trials
                    stddev = (sum((x - mean) ** 2 for x in samples) / trials) ** 0.5
                    print(f'"{pyver}","{__version__}","{n}","{with_auto}","{with_trans}","{mean:.2f}","{stddev:.2f}"', file=f)
And the shell script to run it across Conda environments and transitions versions:

#!/bin/bash

envs=("test-3.12" "test-3.11" "test-3.10")
versions=("0.7.2" "0.8.0" "0.8.11" "0.9.0")

. /opt/homebrew/Caskroom/miniconda/base/etc/profile.d/conda.sh
for env in ${envs[@]}; do
	conda activate ${env};
	for version in ${versions[@]}; do
		echo "Testing transitions version ${version} in ${env}";
		pip install transitions=="${version}";
		python test_init.py;
	done;
	conda deactivate
done;

aleneum (Member) commented May 6, 2024

The same graphs as above, with the unreleased 0.9.1 version added. The observations stay the same:

[charts: the same three comparisons as above, now including 0.9.1]

aleneum (Member) commented May 6, 2024

Test results for 0.9.1 with varying versions of Python:

Python 3.11 and 3.12 seem to result in faster initialisation with 50 states and 50 models, along the lines of:

Machine(model=[Model() for _ in range(m)], states=state_dict[n], transitions=trans_dict[n], initial=state_dict[n][0], auto_transitions=with_auto)

[chart: initialisation time per Python version (50 states, 50 models)]

There is an increase in initialisation time starting from 0.9.0 with multiple models. I guess that this bump is caused by the added "may_" trigger functions and _add_may_transition_func_for_trigger. may_ is also added for auto transitions; I'd assume this is not necessary and could be omitted. The increase is notable, but it does not double the init time or worse:

[chart: initialisation time per transitions version with multiple models]
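
For reference, the may_<trigger> convenience checks introduced in 0.9 work roughly like this (minimal example with assumed state names):

from transitions import Machine

machine = Machine(states=["A", "B"], transitions=[["go", "A", "B"]],
                  initial="A", auto_transitions=False)

print(machine.may_go())  # True: a "go" transition is defined for the current state "A"
machine.go()
print(machine.may_go())  # False: no "go" transition leaves state "B"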

And another test with varying numbers of models, showing a similar correlation between states (and thus auto transitions) and the number of models:

[chart: initialisation time for varying numbers of models and states]

aleneum (Member) commented May 6, 2024

Cycling n models through n states (with one model callback passed by name) also suggests a speed improvement since Python 3.11:

machine = machines_dict[n]  # an already initialised machine with n states
states = state_dict[n]
while machine.models[0].state != states[-1]:
    machine.dispatch("go")
machine.dispatch("to_" + states[0])  # reset to initial state

[chart: dispatch time per Python version]

A change in 0.8.0 resulted in some overhead though:

[chart: dispatch time per transitions version]

aleneum (Member) commented May 6, 2024

Update: a division issue distorted the calculated times. In the previous diagram the y-axis ranged up to 4000 ms; THAT would have been a hefty performance hit. With this fixed, the times are now similar to string-based states, yet slightly worse.

A similar picture for enum states and adding/removing models:

state_dict = {n: Enum("StateEnum", {f"State{i}": i for i in range(n)}) for n in state_nr}

machines_dict = {
    n: Machine(
        model=None,
        states=state_dict[n], initial=state_dict[n].State0,
        after_state_change="callback",
        auto_transitions=False,
        ordered_transitions=True
    )
    for n in state_nr
}


def test_enum(n):  # Model provides the no-op 'callback' referenced by after_state_change (see below)
    machine = machines_dict[n]
    stack = []
    for _ in range(n):
        model = Model()
        machine.add_model(model)
        stack.append(model)
        machine.dispatch("next_state")
        while model.state != state_dict[n].State0:
            machine.dispatch("next_state")
    while stack:
        machine.remove_model(stack.pop())

The machines are already initialised, but doubling n means twice as many models to initialise, to add (and thus to decorate) and to remove (and un-decorate), and twice as many states to cycle.

Python 3.10 seems to be noticeably slower than 3.11 and 3.12:

[chart: enum add/cycle/remove run time per Python version]

Doubling the number of models and states from 50 to 100 increases the run time from 500 ms to 2500 ms, an increase of 400%. But this increase does not look suspicious to me, since every time a model is added, all previously added models are cycled once, and each cycle is twice as long.

[chart: run time for varying numbers of models and states]

aleneum (Member) commented May 6, 2024

Last check related to @philipbjorge's observation:

I checked a decorated and undecorated version of:

  1. adding n models
  2. cycling n models through 50 states
  3. deleting n models

with a pre-initialised machine and states:

states = Enum("StateEnum", {f"State{i}": i for i in range(50)})

machine = Machine(
    model=None,
    states=states, initial=states.State0,
    after_state_change="callback",
    auto_transitions=False,
    ordered_transitions=True,
)
The decorated version ('the standard way', models registered via add_model):
class Model:

    def callback(self):
        pass

# ...

for _ in range(n):
    model = Model()
    machine.add_model(model)
machine.dispatch("next_state")
while model.state != states.State0:
    machine.dispatch("next_state")
while machine.models:
    machine.remove_model(machine.models[0])
The undecorated version (derived from the approach in the linked comment; models manage their own state and trigger lookup):

class Model:

    def __init__(self, machine) -> None:
        self.machine = machine
        self.state = self.machine.initial
        self.machine.models.append(self)
    
    def trigger(self, trigger_name, *args, **kwargs):
        return self.machine.events[trigger_name]\
            .trigger(self, *args, **kwargs)
    
    def callback(self):
        pass

# ...

for _ in range(n):
    model = Model(machine)
for mod in machine.models:
    mod.trigger("next_state")
while model.state != states.State0:
    for mod in machine.models:
        mod.trigger("next_state")
while machine.models:
    _ = machine.models.pop()

Adding 1000 models, cycling them through 50 states and removing the models 'the standard way' takes about 350 ms (on my MacBook), while the undecorated version derived from the mentioned comment is roughly 30% faster. I could not see differences other than the ones observed before between Python or transitions versions. However, the data shows that the 0.7/0.8 performance gap is independent of model decoration.

[chart: decorated vs. undecorated model handling]

Summary: even though this 'benchmark' was more 'qualitative' than 'accurate', I'd assume that drastic changes in performance would have been observable. Let me know if I overlooked something or screwed up somewhere along the way. As of now, I could not find the performance issues described by @idatsy and @philipbjorge. If anyone else faces such performance issues and can provide some leads, that would be much appreciated.

aleneum (Member) commented May 6, 2024

For the record (and/or for masochists), I've attached my test scripts:

transitions-benchmark.zip

aleneum (Member) commented May 24, 2024

I will close this issue since there has not been any feedback in the last three weeks and I could not reproduce the observed performance issues. Feel free to comment nevertheless if you face performance issues and can pinpoint them to some change in transitions. I will reopen the issue if required.

aleneum closed this as completed May 24, 2024