mersenneforum.org

mersenneforum.org (https://www.mersenneforum.org/index.php)
-   Lounge (https://www.mersenneforum.org/forumdisplay.php?f=7)
-   -   Database conversion, dbf & dbt to SQL (https://www.mersenneforum.org/showthread.php?t=26895)

Uncwilly 2021-06-11 16:26

Database conversion, dbf & dbt to SQL
 
I have a database (actually just a single table at the moment) that is in a dBase format. It has the dbf that contains the bulk of the data and a dbt that contains the "Notes" field.
(In dBase and other compatible programs a "Note" field type had a pointer within in the table that pointed to a section of the dbt. Each section was 512 B and they would be chained when there were larger notes. Many of the standard functions could not be done on note fields.)
It contains over 10,000 entries and over 130 fields per record (although I could kill off about a third of them, they were used internally to ease data entry and report generation and label printing.) That is beyond the capability of free on-line converters. Many of the other converters that I have looked into don't mention that they handle dbt's as well as the dbf. I can get it it into an xml file. I have used an online converter to take some smaller tables (~1000 or less records) that are related from xml to a mySQL file that creates the table in SQL.

Any pointers, suggestions, or offers? The table does contain personal contact info, but nothing like CC# or government ID#.

Nick 2021-06-11 18:01

If I remember rightly, the format was simple enough to reverse engineer with ease up to dBase III, but got somewhat messier after that.
The index files were separate, however, so using a hexdump to examine the .dbf and writing a C program should probably suffice.

Uncwilly 2021-06-11 18:25

[QUOTE=Nick;580694]The index files were separate, however, so using a hexdump to examine the .dbf and writing a C program should probably suffice.[/QUOTE]
Correct the index files are separate. The main issue at hand is to get the data out of that format where the software has basically gone away and into a format that can be used more cross platform and the new tech support can handle. The xml conversion is one step. I could cut the file into smaller chunks and covert each of those. The xml to sql converter I used spits out a file that creates the table. It might be something where I do 5 passes through the free converter and cut and paste them into one file one the back endo with Notepadd++. Some of the converters that I tried choked on logic (YES/NO) fields. Also there is a weird character that mark a CR or LF in the note field of those files that some times causes problems when working on the database in Access. (I had a database that was created in the program that this one was run in. I later had to use Access to maintain it.)

chalsall 2021-06-11 18:30

[QUOTE=Uncwilly;580680]Any pointers, suggestions, or offers? The table does contain personal contact info, but nothing like CC# or government ID#.[/QUOTE]

Hmmm... Tricky... PII within an ancient format... This means you need to do this yourself, and not hand it off to "experts".

Knowing you, you already have many copies of the binary data.

Have you thought about using virtual machines to get to the executable level where you can run the DB?

Nick's suggestion is also sound. The data is very likely within a reverse-engineered format.

There might even be a Perl module for that.

retina 2021-06-12 00:55

[QUOTE=chalsall;580700]PII within an ancient format... This means you need to do this yourself, and not hand it off to "experts".[/QUOTE]The OP already mentioned that online converters were an option if they worked so the PII thing is probably not an issue. Since uploading the DB to a "free" online service is obviously going to have that data mined and exploited for all it is worth with no care about confidentiality.

Uncwilly 2021-06-12 01:46

The previously converted files were adjunct files: a postal code routing to local depot table and an item number to name, mass, vendor, etc. table. When those were converted no data of value was released. The free online sites were trials to get one buy the service for 'serious' work. The software trial download that I tried worked fine, except for in about 30% of the fields it substituted "*trial*" instead of the data. The base price for the software is ~700USD. I only need it once. The software that was used to maintain the data lives on another person's machine. I had limited time to retrieve the data (this was with the permission of the computer's owner and main guardian/user of the data. I had an old USB drive that the machine was able to read. (There were multiple complications getting the data retrieved.) I will have to try to get a hold the help file or physical manual for the software and see if there are options for exporting the data to a different format.

LaurV 2021-06-12 03:45

I don't know sql, but if the files are not compressed dbf, i.e. you can see your data inside when looking with a hex editor, then any regex-aware tool (like notepad++, or pn2), or a simple perl script, could transform them in csv files in minutes (i.e. separating fields by placing commas in the right place). Then csv files can be handled by excel, or any db tools.

In the old age, FoxPro was able to handle dbf files very well. Old versions of VS which still contained FoxPro may still be downloadable.

Old CAD tools (like Protel95, before it became Altium) and Corel Draw 5 (before introducing VBA as scripting language) could also handle dbf files (even compressed).

Uncwilly 2021-06-12 05:01

[QUOTE=LaurV;580746] Then csv files can be handled by excel, or any db tools.[/QUOTE]
I have been able to get the files to xml which Excel can open. I suspect if needed I can write some VBA to spit the records out. During the coming week I will have time to work on this. I doubt that I will do anything more this weekend, because of other obligations. I do have other ideas floating around in my head.

To Chalsall's question. Once I got into the machine in question, before I knew that I was going to be able to transfer the data to the USB drive, I zipped the files (with a strong PW) in DOS (the machine boots to DOS and only runs Win98SE if you call it) and wrote the file to a 1.44. Then, once I got it on the USB, I saved it on to my laptop that I brought, then immediately emailed it to the data owner/creator and the person that will be using the SQL. (If my laptop had a parallel port I would have brought my [URL="https://en.wikipedia.org/wiki/LapLink_cable"]LapLink[/URL] cable.) So, the data does exist in multiple physical locations now.

chalsall 2021-06-12 15:01

[QUOTE=Uncwilly;580751]So, the data does exist in multiple physical locations now.[/QUOTE]

Excellent. That is absolutely the first critical step.

Did you, by chance, also clone the software (likely binaries)? A full ZIP of the software's directory is ideal (of the full filesystem is even better if that's an option).

I was serious about using VMs to get the software stack and the data running in its native environment. Sorry if I'm telling you how to chew gum, but [URL="https://www.virtualbox.org/"]Virtual Box[/URL] might be your very good friend here.

P.S. I don't like Oracle as a company; way too litigious for my comfort. But they acquired VirtualBox when they bought Sun, and fortunately continue to support it. VMWare is another option.

xilman 2021-06-12 17:02

[QUOTE=Uncwilly;580680]I have a database (actually just a single table at the moment) that is in a dBase format. It has the dbf that contains the bulk of the data and a dbt that contains the "Notes" field.
(In dBase and other compatible programs a "Note" field type had a pointer within in the table that pointed to a section of the dbt. Each section was 512 B and they would be chained when there were larger notes. Many of the standard functions could not be done on note fields.)
It contains over 10,000 entries and over 130 fields per record (although I could kill off about a third of them, they were used internally to ease data entry and report generation and label printing.) That is beyond the capability of free on-line converters. Many of the other converters that I have looked into don't mention that they handle dbt's as well as the dbf. I can get it it into an xml file. I have used an online converter to take some smaller tables (~1000 or less records) that are related from xml to a mySQL file that creates the table in SQL.

Any pointers, suggestions, or offers? The table does contain personal contact info, but nothing like CC# or government ID#.[/QUOTE]I have some elderly machines in the attic, some of which still work just fine.

I can't handle anything pre-CP/M, sorry, but there is an excellent chance that MS-DOG 3.x will still boot. Whether dBase can be installed on them remains to be seen. If it does, perhaps an ASCII dump might be possible? All this is way beyond my previous experience.

OTOH, if you could provide (a representive sub-set) of your files I see a possibly interesting cryptanalytical challenge.

How big are they? Anything less than a gigabyte, which seems rather plausible given their age, would be acceptable.

chalsall 2021-06-12 18:49

[QUOTE=xilman;580808]OTOH, if you could provide (a representive sub-set) of your files I see a possibly interesting cryptanalytical challenge.[/QUOTE]

Agreed.

How many can decode the [URL="https://voyager.jpl.nasa.gov/golden-record/"]Golden Record[/URL] using first principles?


All times are UTC. The time now is 22:39.

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.