
# Update for April 2, 2018

## Contributions

@chgibb helped us get an initial Travis build in #48. This partially addresses our help-wanted issue, but we're going to leave that issue open, since we eventually want to run only the minimal set of tests relevant to a given change to mitigate one of the downsides of being a monorepo. Thanks to @chgibb for getting this started.

## Almost done with the file finder

Our main focus last week was finishing the file finder feature discussed in the previous update. That update covered our approach to scanning the directory tree from the file system into an in-memory representation, and the approach we described remains pretty much unchanged. We plan to merge the pull request early this week.

### Leveraging prior art

Last week was all about using that in-memory representation to return search results for a "fuzzy" search query. After an initial attempt that yielded decent performance but poor ranking of the results, we decided to investigate existing solutions. We tried two command-line fuzzy finders, `fzy` (written in C) and `fzf` (written in Go), on the Electron repository, which contains over 500,000 files when `.gitignore` filtering is disabled.

Both tools yielded excellent performance and high-quality results, and since the core matching algorithm behind `fzy` was reasonably straightforward to read and understand, we decided to port it to Rust. You can read more about the algorithm in the `fzy` repository, but at a high level, their solution is based on dynamic programming: it determines the optimal match positions for a given query by populating a matrix with cascading values. We copied their basic approach almost exactly, but we also enhanced it a bit to make use of the existing tree structure to recycle computation for common path prefixes.
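As a rough illustration of that approach (this is a simplified sketch with made-up constants and only a separator bonus, not Xray's port or `fzy`'s exact rules), a scoring function in Rust might look like this:

```rust
const SCORE_MIN: f64 = f64::NEG_INFINITY;
const GAP_PENALTY: f64 = -0.01;
const CONSECUTIVE_BONUS: f64 = 1.0;
const SEPARATOR_BONUS: f64 = 0.8;

/// Scores how well `query` fuzzily matches `path`; higher is better, and
/// `SCORE_MIN` means no match at all.
fn fuzzy_score(query: &str, path: &str) -> f64 {
    let query: Vec<char> = query.chars().map(|c| c.to_ascii_lowercase()).collect();
    let path_raw: Vec<char> = path.chars().collect();
    let path: Vec<char> = path_raw.iter().map(|c| c.to_ascii_lowercase()).collect();
    let (n, m) = (query.len(), path.len());
    if n == 0 || n > m {
        return SCORE_MIN;
    }

    // ending[i][j]: best score when query[i] is matched exactly at path[j].
    // best[i][j]: best score for query[..=i] matched somewhere in path[..=j].
    let mut ending = vec![vec![SCORE_MIN; m]; n];
    let mut best = vec![vec![SCORE_MIN; m]; n];

    for i in 0..n {
        for j in 0..m {
            if query[i] == path[j] {
                // Reward matches that start a path component.
                let bonus = if j == 0 || path_raw[j - 1] == '/' {
                    SEPARATOR_BONUS
                } else {
                    0.0
                };
                ending[i][j] = if i == 0 {
                    bonus + j as f64 * GAP_PENALTY
                } else if j > 0 {
                    // Either extend a consecutive run of matches, or attach to
                    // the best earlier match of the previous query character.
                    (ending[i - 1][j - 1] + CONSECUTIVE_BONUS)
                        .max(best[i - 1][j - 1] + bonus)
                } else {
                    SCORE_MIN
                };
            }
            best[i][j] = if j == 0 {
                ending[i][j]
            } else {
                ending[i][j].max(best[i][j - 1] + GAP_PENALTY)
            };
        }
    }

    best[n - 1][m - 1]
}
```

The consecutive and separator bonuses are what make a query like `init` prefer a tight match at the start of a path component over a scattered one, and the prefix recycling mentioned above would amount to reusing the matrix columns computed for a shared parent directory rather than recomputing them for every child path.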

### Matching and scoring

Xray matches paths in two phases. First, we scan the tree to determine which paths match the query, populating a hash map that marks which file system entries either match the query or contain matches to the query. Checking whether a path matches only requires a linear character comparison, which is cheap and constrains the search space for the next step. Once we know the matches, we walk the tree and associate each matching path with a score. Scoring is O(N*M), where N is the length of the query and M is the length of the path. Luckily, longer queries tend to match fewer paths, so when scores are most expensive to compute, we usually need to compute fewer of them.
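For illustration, the first-phase check is essentially a case-insensitive subsequence scan along these lines (a sketch, not Xray's actual code):

```rust
/// Returns true if every character of `query` appears in `path`, in order.
/// This linear scan is the cheap filter that decides whether a path is worth
/// scoring at all.
fn is_subsequence_match(query: &str, path: &str) -> bool {
    let mut query_chars = query.chars();
    let mut needle = match query_chars.next() {
        Some(c) => c,
        None => return true, // an empty query matches every path
    };
    for c in path.chars() {
        if c.eq_ignore_ascii_case(&needle) {
            match query_chars.next() {
                Some(next) => needle = next,
                None => return true, // all query characters were found, in order
            }
        }
    }
    false
}
```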

### Results

Overall, we're happy with the results. The quality of the matches is extremely high thanks to the work @jhawthorn put into tuning the scoring criteria. Since ranking matches is somewhat subjective, basing our results on an existing, fairly mature solution gives us a lot more confidence in their quality. The performance is also pretty decent. Searching for `init` in the 151,201 files of the blink repository yields results in ~120ms on my machine. Searching for `init.py`, which is a more selective query, drops that to ~16ms.

### Future improvements

These early results are good, but we think there's room for improvement. First, we're still matching on a single thread, and it seems like we might be able to use Rayon to parallelize the matching across multiple CPU cores. We could also do a better job of reporting progress: 20ms into a query we could check whether ranking is more than 20% complete, and if it isn't, display some sort of subtle progress indicator. That could help the search feel responsive even if it takes 100+ms to return results. That said, we're going to call this good for now and move on to other areas. The file finder already feels fast and fluid, even for big repositories, and we think we have a solid foundation in place for future improvements.
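If we do reach for Rayon, the scoring pass might be parallelized roughly like this (a hypothetical sketch that reuses the `fuzzy_score` function from the earlier example; `ScoredPath` and `score_in_parallel` are illustrative names, not current Xray code):

```rust
use rayon::prelude::*;

struct ScoredPath {
    path: String,
    score: f64,
}

/// Scores the paths that survived the matching phase, spreading the O(N*M)
/// scoring work across Rayon's thread pool, then sorts the results best-first.
fn score_in_parallel(query: &str, matched_paths: Vec<String>) -> Vec<ScoredPath> {
    let mut results: Vec<ScoredPath> = matched_paths
        .into_par_iter()
        .map(|path| ScoredPath {
            score: fuzzy_score(query, &path),
            path,
        })
        .collect();
    results.sort_unstable_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    results
}
```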

## Other improvements

Since we're still fairly early in development, we're allowing branches to get longer and heavier than we might in a more established project. Folded into the file finder branch are a few smaller improvements that made sense to add along the way.

### Window and view API refinements

We display the file finder as a modal in the workspace, and when the user selects a file or cancels the modal, we need to take action in the workspace. After pondering a couple of approaches, we decided to use a fairly traditional delegate pattern, where `WorkspaceView` implements the `FileFinderViewDelegate` trait and passes a weak reference to itself to the `FileFinderView`.

Trouble is, how does the `WorkspaceView` obtain a weak reference to itself? Since the `Window` wraps each view in an `Rc<RefCell>`, we decided it would be convenient for the window to pass each view a `WeakViewHandle` to itself in the view's `will_mount` hook. Many views can simply ignore this parameter, but views that need to perform delegation can safely store and clone it without worrying about leaking memory, enabling them to hand themselves out as delegates of child views. This is how we connect actions dispatched on the `FileFinderView` to state changes in the workspace.
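Condensed into a sketch (with heavily simplified types; the real view and window machinery has more to it, and the method names on the delegate trait are assumptions), the wiring looks roughly like this:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// A weak, clonable handle to a view that the window owns via Rc<RefCell<T>>.
struct WeakViewHandle<T>(Weak<RefCell<T>>);

impl<T> WeakViewHandle<T> {
    fn upgrade(&self) -> Option<Rc<RefCell<T>>> {
        self.0.upgrade()
    }
}

impl<T> Clone for WeakViewHandle<T> {
    fn clone(&self) -> Self {
        WeakViewHandle(self.0.clone())
    }
}

// The workspace acts as the file finder's delegate.
trait FileFinderViewDelegate {
    fn did_confirm(&mut self, path: String);
    fn did_cancel(&mut self);
}

struct WorkspaceView {
    open_paths: Vec<String>,
}

impl FileFinderViewDelegate for WorkspaceView {
    fn did_confirm(&mut self, path: String) {
        // Open the selected file and dismiss the modal.
        self.open_paths.push(path);
    }

    fn did_cancel(&mut self) {
        // Dismiss the modal without opening anything.
    }
}

struct FileFinderView<T: FileFinderViewDelegate> {
    delegate: Option<WeakViewHandle<T>>,
}

impl<T: FileFinderViewDelegate> FileFinderView<T> {
    fn confirm(&mut self, path: String) {
        // The weak handle can't keep the workspace alive, so upgrade before use.
        if let Some(delegate) = self.delegate.as_ref().and_then(|handle| handle.upgrade()) {
            delegate.borrow_mut().did_confirm(path);
        }
    }
}

fn main() {
    // The window would normally hand the workspace this handle in will_mount.
    let workspace = Rc::new(RefCell::new(WorkspaceView { open_paths: vec![] }));
    let mut finder = FileFinderView {
        delegate: Some(WeakViewHandle(Rc::downgrade(&workspace))),
    };
    finder.confirm("src/main.rs".to_string());
    assert_eq!(workspace.borrow().open_paths, ["src/main.rs"]);
}
```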

### Focus API

We also needed a way to focus the file finder when it is displayed, then focus the newly opened editor after a file is selected. We decided to implement this on the server side via the new `ViewHandle::focus` method. Whenever this method is called, it assigns the focused view's id to the `focused` field on the `Window`. This gets relayed to the client, which calls the `focus` method on the corresponding React component.
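In simplified form (the field and type names here are assumptions rather than Xray's exact API), the server-side half amounts to something like this:

```rust
use std::cell::Cell;
use std::rc::Rc;

type ViewId = usize;

struct Window {
    // The id of the view that should currently have focus, if any. Changes to
    // this field are included in the next update flushed to the client, which
    // responds by calling focus() on the corresponding React component.
    focused_view: Cell<Option<ViewId>>,
}

struct ViewHandle {
    view_id: ViewId,
    window: Rc<Window>,
}

impl ViewHandle {
    fn focus(&self) {
        self.window.focused_view.set(Some(self.view_id));
    }
}
```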

For now, we aren't interested in replicating the focus state to the server. Server-side code can request that a view be focused, but it can't ask which view is currently focused. This is a decision we can revisit later, but focus is a very weird piece of global state that references individual DOM nodes, so it doesn't seem worth the complexity of attempting to represent it outside of the browser environment. This means that the modal panel will still need to have a bit of custom focus handling logic in order to restore focus to the previous element when cancelled, but so far this seems manageable.

### CLI improvements

We've also changed the structure of the CLI's relationship with the server and Electron slightly. Previously, when we spawned Electron, we could ask it to relay a message to the server via the `XRAY_INITIAL_MESSAGE` environment variable. Now, the CLI waits for the Electron app to emit `Listening\n` on stdout, then connects to the server itself to send the initial message.
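From the CLI's side, the new handshake looks roughly like this (the executable name, socket path, and message format below are placeholders, not Xray's actual protocol):

```rust
use std::io::{BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    // Spawn the Electron app with its stdout piped back to the CLI.
    let mut child = Command::new("xray-electron")
        .stdout(Stdio::piped())
        .spawn()?;

    // Wait for the app to announce that the server is accepting connections.
    let stdout = BufReader::new(child.stdout.take().expect("stdout was piped"));
    for line in stdout.lines() {
        if line? == "Listening" {
            break;
        }
    }

    // Connect directly and send the initial message ourselves, so the server
    // can report errors back to the CLI over the same socket.
    let mut socket = UnixStream::connect("/tmp/xray.sock")?;
    socket.write_all(b"{\"type\": \"OpenWorkspace\", \"paths\": []}\n")?;
    Ok(())
}
```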

We made this change to improve error handling. The server may need to report an error message to the CLI over the socket, and that was going to be complicated to achieve with the previous approach of delegating the initial message send to Electron.

Waiting for Electron to tell us the server has started may introduce some latency, which is why we initially preferred the delegation approach. But given that we now need to receive a response from the server, we'll want to actually measure that latency before deciding the extra complexity of the old approach is warranted.

## The week ahead

We hope to merge the file finder PR. All that's left is some basic styling and iteration on focus handling.

After that, we plan to start working on shared headless workspaces. The hardest part is enabling concurrent text editing, but that's pretty much solved by our use of a CRDT as Xray's core text-storage structure. However, there's still plenty of complexity remaining in terms of how we actually connect buffer replicas together and structure the client/server interaction.

We plan to explore Cap'n Proto RPC, which seems to have an actively-maintained Rust implementation. None of us has ever used it, so we'll need to see how the reality matches up to its promises, but on initial investigation it looks like it could be a good fit for Xray's needs.

Cap'n Proto offers a compact yet evolvable binary representation for messages, and the RPC system seems like it makes it easy to expose any object over the network in a secure way and efficiently call its methods. As long as they're well-implemented, these features seem sufficiently general to be a foundation for network interaction between Xray instances.

At this point, Xray is still too young to be usable. But we're trying to ruthlessly prioritize and zero in on the highest value and highest risk aspects of the system as soon as possible. It's unfortunate that Xray doesn't build on Windows right now, but there's honestly not that much to see or use anyway. If you're a Windows user and you're interested in helping out, getting a named-pipes- or TCP-based connectivity solution in place on Windows would be a great place to start.