Skip to content
This repository has been archived by the owner on May 16, 2019. It is now read-only.

Added a backup tool to import/export store data #35

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

marcjamot
Copy link

Added a backup tool in order to import and export data for the OpenBazaar server.
It has four public methods which handles backup/restore of files and export/import of data from the database with connections to the API for the client. All the methods base their work directory from OpenBazaar directory, usually "~/OpenBazaar".

backupFiles takes an output file name/path and archives all the content of the OpenBazaar directory to that file.

exportDatabase takes a list of table names and exports their data as CSV files to "OpenBazaar/backup/" directory. It also takes a boolean flag to remove previous files before generation.

restoreFiles imports the files from the given archive to the OpenBazaar directory, overwriting previous files. It also takes a boolean flag to remove previous CSV files since the importDatabase method imports all CSV files in backup directory.

importDatabase imports all CSV files from backup folder to the table, used to import data. It also takes a boolean flag to replace data instead of appending it.

@gubatron
Copy link
Contributor

might want to make all your methods with pythonic_names, in one module you did it fine but on the other one you used camelCase (I personally love camelCase better, but this is a python project and I bet you will get "yelled" at for that at some point)

import tarfile
import time

_TABLES = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think there might be a way to initialize this programatically out of the database schema? this seems like a maintenance hell.

@hoffmabc
Copy link
Member

Thanks for the submission. Will review this morning. As @gubatron mentioned we want to have consistency on code and as you can see the build failed. Just make sure it passes the pylint checks. They include the naming conventions.

def _exportDatabaseToCsv(tablesAndColumns):
"""Reads the database for all given tables and stores them as CSV files."""
dbFile = _getDatabase()
result = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result looks like an unused variable that you forgot to clean up.

@marcjamot
Copy link
Author

Yeah, I'll run it through pylint to make sure it goes well. Also the return True part makes sense. For the import of tables, does it make sense to do as now where it imports all the CSV files or should it be changed to also take a list of tables as export does? Seems the latter makes more sense when it's part of an API.

@cpacia
Copy link
Member

cpacia commented Oct 1, 2015

Thanks for this. I will review it this week.

@gubatron
Copy link
Contributor

gubatron commented Oct 1, 2015

@PLANKT :

Few more questions:

  1. Does the database have indexes you might want to re-create when importing?
  2. Instead of doing the exporting/backup manually, did you have the chance to take a look at sqlite's backup api? (I'm concerned about the maintenance issues this will have, specially the part where you sort of recreate the schema at the beginning on that _TABLES dict. it will be a never ending bug fountain forgetting to come back here and adding/removing/renaming fields and tables, if you intend to do the backups with the functions coded in the patch, you should try and form that _TABLES dict programatically out of the DB schema.)

@cpacia
Copy link
Member

cpacia commented Oct 3, 2015

I haven't had the time to test it yet, but the one comment I would make is I would put backuptool.py in the db subfolder.

@marcjamot
Copy link
Author

Regarding indexes and table structure, as mentioned it would be convenient for the backup tool to simply handle states, then it won't matter if the tables change in any way as long as the backup is between compatible versions. I looked into two solutions:

  • Backup the file. By locking the database file and doing a copy of the database file, all the data is stored and can be restored for a given time. This blocks any database access during the access and can be done with the default sqlite package.

Sqlite's official backup API seems to be missing from the Python sqlite package, there is however a module that I found have been recommended from different sites.

  • husio/python-sqlite3-backup (https://github.com/husio/python-sqlite3-backup). The backup can be done from file or memory to file or memory, so no lock for the whole database is needed. Done between two connections as simply as "sqlitebck.copy(conn, conn2)".

In both scenarios the single table backup functionality would be removed, it makes more sense to backup/restore between states of the whole database, and if single table export is wanted it could be an additional feature for datastore to prevent potential missed bugs when changing table structures.

@gubatron
Copy link
Contributor

q: what happened to the previous backup logic? I remember coding somethign which would basically make a copy of the entire db folder and .tgz it.

@marcjamot marcjamot force-pushed the develop branch 3 times, most recently from fecd9da to ca2e123 Compare October 16, 2015 07:57
@marcjamot
Copy link
Author

I changed the tool to only include two methods to simplify and do a state backup. Also moved it to db folder. The "backup" locks the database and does a full backup of the OpenBazaar data directory for the user (except the backup folder where it places the backups) and the "restore" restores the backup.

TEST_TAR = 'test.tar.gz'
TEST_FILE_1 = 'test.txt'
TEST_FILE_2 = TEST_FOLDER + os.sep + 'test2.txt'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8-nitpicking: Please add an extra blank line here, before the class definition.

@cpacia
Copy link
Member

cpacia commented Dec 11, 2015

This is something I'd like to revisit once the database schema is set for this version. Also need to make sure the json contracts get backed up.

@cpacia cpacia force-pushed the master branch 4 times, most recently from 6bf74b9 to 4aac065 Compare March 4, 2016 17:06
@drwasho drwasho added this to the Future version milestone Mar 28, 2016
@drwasho drwasho removed this from the Future version milestone Apr 20, 2016
@jonathancross
Copy link
Contributor

Hi, it has been more than a year since @cpacia's last comment. Should this PR be closed or is there a small chance it could be improved and merged?

@gubatron
Copy link
Contributor

try rebasing and see if this even makes sense anymore. I'd close, this will probably be irrelevant very soon now that the server for OB2 will be made in go

@gasull gasull mentioned this pull request Jun 18, 2017
@hoffmabc
Copy link
Member

Thanks for the bump @gasull. I think we are pretty much only making minor changes or bug fixes as a priority due to 2.0.

@cpacia do you want to review this again and perhaps if it makes sense we can put it in the next release?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants