Sunday, April 20, 2014

Batch Compressing (and decompressing) Multiple Directories

Hi all, thought I'd share another cool thing I did.

I like to back up my old data, and I have a ton of directories with hundreds of thousands of files in them. Compressing them would make them 1. much smaller, and 2. easier to store. But compressing everything into one file would be dangerous and very difficult to manage, and compressing every folder individually would take forever.

Fortunately, 7-zip has some nice command line tools. Using these, I wrote a small script that I can put anywhere, and double-clicking on it will compress every folder next to the script into a separate .zip file, AND test them for errors. Here's how:

Compressing

Step 1: Get 7-zip (Google it).

Step 2: Make a new text file, and paste in the following code:

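@echo off
rem Compress every folder next to this script into its own .zip (-mmt turns on multithreading).
for /d %%G in (*) do "C:\Program Files (x86)\7-Zip\7z.exe" a "%%G.zip" "%%G\" -mmt
rem Test every .zip, hiding the per-file "Testing" lines and the copyright banner.
for %%G in (*.zip) do "C:\Program Files (x86)\7-Zip\7z.exe" t "%%G" * -r  | FIND /v "Testing " | FIND /v "Copyright "
pause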

NOTE: You may need to change the "C:\Program Files (x86)\7-Zip\7z.exe" parts to point to where you installed 7-zip, if it's different from mine (for example, a 64-bit install usually lands in "C:\Program Files\7-Zip\7z.exe").

Step 3: Save this file with any name, with the .bat extension (for example, batch-compressor.bat).

Step 4: Put this file somewhere that has a bunch of directories in it.

Step 5: Double-click the file (do NOT run as administrator). A command prompt will pop up and show you the progress. When it is done, look through and make sure there were no errors and all the sections say "Everything is OK".

You should notice a bunch of .zip files next to your folders now! Just delete the folders if you don't want them anymore, and keep the .zip files.

Notes:
  • This only compresses folders and the files inside them, not loose files sitting in the same directory as the script. You'll need to do those yourself if you want.
  • This does not delete the original folders; if you want to do that, just delete them when it's done. Make sure your drive has enough room for all the new .zip files, especially if you're compressing a lot of data.
  • Personally, I would recommend not making each zip file larger than a few GB, preferably much smaller. Archives that are too big are at higher risk of corruption or some other issue, which would then affect a larger amount of data, and they're also a pain if you want to move specific files around later.
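
If you'd rather not depend on 7-Zip at all, the same idea can be sketched in Python. This is only a rough equivalent, not the script above: it uses Python's built-in zipfile module instead of 7z.exe, and the function name compress_each_folder is made up for illustration. Like the batch file, it makes one .zip per folder, keeps the folder name as the top level inside the archive, leaves the originals alone, and tests each archive after writing it.

```python
import zipfile
from pathlib import Path


def compress_each_folder(root):
    """Zip every folder directly under `root` into `<folder>.zip`, then test it.

    Hypothetical helper: it stands in for the 7-Zip batch file, using the
    standard-library zipfile module rather than 7z.exe.
    """
    root = Path(root)
    created = []
    # Collect the folders first so the new .zip files never affect the loop.
    folders = sorted(p for p in root.iterdir() if p.is_dir())
    for folder in folders:
        archive = root / (folder.name + ".zip")
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for path in sorted(folder.rglob("*")):
                if path.is_file():
                    # Store paths relative to root, e.g. "photos/2013/a.jpg".
                    zf.write(path, path.relative_to(root).as_posix())
        # Re-open the archive and verify every member's CRC.
        with zipfile.ZipFile(archive) as zf:
            bad_member = zf.testzip()
        if bad_member is not None:
            raise RuntimeError(f"{archive.name}: corrupt member {bad_member}")
        created.append(archive)
    return created
```

Calling compress_each_folder(".") from the directory that holds your folders does roughly what double-clicking the batch file does. Note that this sketch skips empty folders, since it only stores files.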

Decompressing

Now how do you decompress (extract) these archives you just made? Again, decompressing them manually would take forever. Well, there's a script for that too.

Step 1: Make a new batch file, the same way as above. Paste in the following code:

@echo off
rem Extract every .zip next to this script, keeping the folder structure stored inside it.
for %%G in (*.zip) do "C:\Program Files (x86)\7-Zip\7z.exe" x "%%G" -mmt
pause

NOTE: Again, you may need to change the "C:\Program Files (x86)\7-Zip\7z.exe" part to point to where you installed 7-zip, if it's different from mine.

Step 2: Save the file with a new name, and as a .bat (for example, batch-decompressor.bat).

Step 3: Put the file next to all the archives you want to extract.

Step 4: Double-click the file (do NOT run as administrator). A command prompt will pop up and show you the progress.

And you're done! Now you have a way to compress lots of folders into individual archives, and a way to extract all of them again in one fell swoop.
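
For completeness, here is a matching Python sketch of the extraction side. Again, this is an assumption-laden stand-in using the standard zipfile module, not 7-Zip itself, and extract_all_archives is a made-up name. It unpacks every .zip in a directory into that same directory, keeping the paths stored in each archive.

```python
import zipfile
from pathlib import Path


def extract_all_archives(root):
    """Extract every .zip directly under `root` into `root` itself.

    Hypothetical helper mirroring the batch-decompressor script; existing
    files with the same names are overwritten without asking.
    """
    root = Path(root)
    extracted = []
    for archive in sorted(root.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(root)
        extracted.append(archive.name)
    return extracted
```

Unlike 7-Zip, this will not stop to ask before overwriting a file that already exists, so try it on a copy first.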

Happy archiving!

Sunday, September 29, 2013

Pleco - Cantonese and Historical Chinese dictionaries

My god, it's been... 3 years since I've posted here?
Well, I've decided to make this my place to put useful information that I find out and that isn't easily searchable on the internet.
On that note, today's post is regarding Pleco, the fabulous mobile Chinese dictionary. I found myself wanting a good Cantonese dictionary, and also a good Historical Chinese dictionary, but none were available. So, I made my own (that is, found them on the internet and edited them / converted them to Pleco format).

Details and instructions for importing Yedict (a Cantonese dictionary based on CEDICT and other material) and a list of Chinese characters with Baxter's middle and old Chinese readings/definitions can be found below, for your perusal. I won't provide the files themselves though, because I am not totally sure as to the copyright statuses of the sources.

YEDICT
A large compilation of quasi-free Cantonese Chinese definitions and readings. Most of the definitions are from CEDICT, which is already available on Pleco for free but does not have Cantonese readings. Furthermore, Yedict (as it's called) includes Cantonese definitions merged from Cantonese Stardict, so it's much more useful for those specifically-Cantonese terms.

Source: http://writecantonese8.wordpress.com/2012/02/04/cantonese-cedict-project/
SPECIFICALLY, tym's amended version (from the comments), which saves some time: http://dl.dropbox.com/u/3648660/yedict_20130108.7z

A lot of these instructions were modelled on alex_hk90's instructions, which also inspired me to write this. If parts of this are confusing, he may have a better explanation in his post here: 
http://www.plecoforums.com/threads/user-dictionary-specification.3218/page-2
However, his instructions involved using commands on an application that only seems to run (well) on *nix systems (Linux, OSX). So I wrote these to be used on any computer (that can run Notepad++ (all of them?)). The regular expressions are largely similar, but there are some differences.
If you don't know what Notepad++ is, google it.

Right now, the entries should look like this:
繒 缯 [jang1] [zeng1|zeng4] /silk fabrics/surname Zeng/to tie/to bind/

-Change to single delimiter
Use Notepad++ regular expression find/replace with: 
Find: (.+?) (.+?) \[(.*?)\] \[(.*?)\] /(.+)
Replace with: \1@\2@\[\3\]@\[\4\]@/\5
NOTE there is a space after the colon; do not include that space in the find/replace fields.

-Add Pleco definition formatting
1. Add bullet-point to the beginning of defs.
Find: (.+?)@/(.+)/
Replace with: \1@•\2
2. Add bullet-point to the middle parts of defs.
Find: /
Replace with: • 
NOTE the special character here. This is a special Unicode character that Pleco uses to break lines. The bullet is for cosmetic purposes.

-Convert to Pleco card format.
Find: (.*)@(.*)@\[(.*)\]@\[(.*)\]@(.*)
Replace with: \2[\1]\t\4\t[\3]\5

Now, the entries should look something like this:
缯[繒] zeng1|zeng4 [jang1]•silk fabrics• surname Zeng• to tie• to bind
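
If you would like to check the three steps on one entry before running them over the whole file, they can be sketched in Python. This is just a rough check, not part of the Notepad++ workflow: the function name yedict_to_pleco is made up, the special Pleco line-break character mentioned above is left out, and the replacement strings drop the backslashes before the square brackets because Python's re module would otherwise keep them literally.

```python
import re


def yedict_to_pleco(line):
    """Apply the find/replace steps from this post to one Yedict entry.

    Hypothetical helper; Pleco's special line-break character is omitted.
    """
    # Step 1: change to a single "@" delimiter between the fields.
    line = re.sub(r"(.+?) (.+?) \[(.*?)\] \[(.*?)\] /(.+)",
                  r"\1@\2@[\3]@[\4]@/\5", line)
    # Step 2a: bullet at the start of the definitions, drop the final slash.
    line = re.sub(r"(.+?)@/(.+)/", r"\1@•\2", line)
    # Step 2b: every remaining slash separates two definitions.
    line = line.replace("/", "• ")
    # Step 3: reorder into simplified[traditional], pinyin, [jyutping]defs.
    line = re.sub(r"(.*)@(.*)@\[(.*)\]@\[(.*)\]@(.*)",
                  r"\2[\1]\t\4\t[\3]\5", line)
    return line


sample = "繒 缯 [jang1] [zeng1|zeng4] /silk fabrics/surname Zeng/to tie/to bind/"
print(yedict_to_pleco(sample))
```

The printed line should have the same shape as the entry shown above, with tab characters separating the headword, the pinyin, and the bracketed Cantonese reading plus definitions.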

-Manually break into smaller files (?)
This is due to Pleco being EXTREMELY slow at importing. I broke the list into 22 files of 10,000 definitions each, but that was maybe a bad idea. A better idea would be to break it into fewer, larger files, like maybe 30,000 definitions each, and let it run overnight every night for a week. Remember to BACK UP THE DATA FILE while importing. And after each import, CHECK THE ENTRY COUNT to make sure it didn't stop midway through without informing you (which it does, a lot).
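
The splitting doesn't have to be done by hand. Here is a small Python sketch that cuts a converted file into numbered parts and reports how many entries went into each, which gives you the numbers to compare against Pleco's entry count after every import. The function names and the "_partNN.txt" naming are my own invention, and the 30,000 default is only the guess from above.

```python
from pathlib import Path


def split_entries(entries, chunk_size=30000):
    """Split a list of entry lines into consecutive chunks of at most chunk_size."""
    return [entries[i:i + chunk_size] for i in range(0, len(entries), chunk_size)]


def write_chunks(source, chunk_size=30000):
    """Write `<name>_part01.txt`, `<name>_part02.txt`, ... next to `source`.

    Returns a list of (output path, entry count) pairs. Hypothetical helper.
    """
    source = Path(source)
    text = source.read_text(encoding="utf-8")
    # One entry per line; blank lines are skipped.
    entries = [ln.rstrip("\r") for ln in text.split("\n") if ln.strip()]
    outputs = []
    for number, chunk in enumerate(split_entries(entries, chunk_size), start=1):
        out = source.with_name(f"{source.stem}_part{number:02d}.txt")
        out.write_text("\n".join(chunk) + "\n", encoding="utf-8")
        outputs.append((out, len(chunk)))
    return outputs
```

Add up the counts it returns and you know how many entries the finished user dictionary should contain.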

-Move files to phone and import into a new Pleco user dictionary. The file should be .txt. A few of the entries have errors, but that's something you can fix by hand as you find them, if you wish.


HISTORICAL
A little over 4000 characters and their middle and old Chinese readings and meanings. I took the liberty of making it much cleaner and easier to read for non-Chinese historical linguists, while retaining accuracy as much as possible. This uses the same method as Yedict, above.

Source: http://crlao.ehess.fr/docannexe.php?id=1221

The entries should look like this:
埃 āi ai1 'oj '- -oj A *qˤə dust 0938b U+57C3

-Get rid of extra data (pīnyīn, MCI, MCF, GSR, UTF-16)
Use Notepad++ regular expression find/replace with:
Find: (.)\t(.*)\t(.*)\t(.*)\t(.*)\t(.*)\t(.*)\t(.*)\t(.*)\t(.*)\t(.*)
Replace with: \1\t\3\t\4\t\7\t\8\t\9

-Check for missing tabs after entries and manually add them (encoding issue?)
Find: ^(.)[^\t](.*\w+)

-Get rid of Baxter's tone tags on MC (he numbers them in a separate field, so these are redundant)
Find: ^(.*)\t(.*)\t(.*)[XH]\t([ABCD])
Replace with: \1\t\2\t\3\t\4

-Change tone letters to names
Find: (.*)\t(.*)\t(.*)\tA\t(.*)\t(.*)
Replace with: \1\t\2\t\3\teven\t\4\t\5
Find: (.*)\t(.*)\t(.*)\tB\t(.*)\t(.*)
Replace with: \1\t\2\t\3\trising\t\4\t\5
Find: (.*)\t(.*)\t(.*)\tC\t(.*)\t(.*)
Replace with: \1\t\2\t\3\tdeparting\t\4\t\5
Find: (.*)\t(.*)\t(.*)\tD\t(.*)\t(.*)
Replace with: \1\t\2\t\3\tentering\t\4\t\5

-Add labels
Find: (.*)\t(.*)\t(.*)\t(.*)\t(.*)\t(.*)
Replace with: \1\t\2\tMC: \3 \(\4 tone\)\tOC: \5\t\6

-Add Pleco definition formatting
Add bullet-point to the beginning of defs.
Find: (.*)\t(.*)\t(.*)\t(.*)\t(.*)
Replace with: \1\t\2\t\3\t\4\t• \5


-Rearrange and convert to Pleco card format
Find: (.*)\t(.*)\t(.*)\t(.*)\t(.*)
Replace with: \1\t\2\t\5 \3 \4

By now, the entries should look like this:
ai1 • dust MC: 'oj (even tone) OC: *qˤə 

-Manually delete first line (the fields key)

-Move file to phone and import into a new Pleco user dictionary. The file should be .txt. A few of the entries have errors, but that's something you can fix by hand as you find them, if you wish.

Monday, September 13, 2010

Relationships: How real are they?

Let's be honest with ourselves here when we say that we probably spend almost as much time, if not more time, communicating with people and socialising through technology than face-to-face. Obviously, this was not the case a century ago.
So a thought came to me.
Say, you're posting on an online forum and you find someone you share interests with. You start chatting with that person over instant messaging. Pretty soon, you find yourself texting them about all your littlest thoughts, like "omg that film was teh sux, what do u think?" and "i hate espresso, dont u?". Then one day you finally propose, "hey wanna skype?" They refuse, and point out the fact that they are not, in fact, a real person, but just a cleverly designed AI.
What would you do? Would the fact that they're not human completely change your relationship with them? Or maybe you should ask yourself, how different is he/she to a human after all? Hell, you sure couldn't tell the difference.
I feel this may become a legitimate social issue before long. Anyone have any thoughts? Better question, is anyone still reading this?

Hi

This blog's probably been read some 1 times since my last post, so for my 1 loyal follower, I'm back! Well actually probably not, my life isn't interesting enough/I'm not self-centred enough to become a ritualistic blogger. Granted, I do frequently come across information that may be useful to the general public, but hell I'm way too lazy to post stuff like that.

Thursday, July 30, 2009

Vocaloid

Hello my (not) numerous readers.

Something I've done recently. I got myself the software Vocaloid 2, which I must admit is quite a piece of work. It's typically in Japanese but there's an English version as well. There's a fairly large culture in Japan surrounding Vocaloid, and specifically, one of its voices, Miku Hatsune (初音ミク), who acts as a sort of mascot alongside a bunch of other mascots. Someone even created a program for making dance music videos using 3D models of Miku and the others, called MikuMikuDance. The newer versions also have (moderately well translated) English language support.

Here's a video documentary of Miku Hatsune (Hatsune Miku in Japanese name order), with Vocaloid 2 and MikuMikuDance making appearances. There are English subtitles available.



I've been playing around a bit with the Vocaloid 2 software and MikuMikuDance. Both are very easy to use for their power. I may mention them again in later updates.

Tuesday, July 21, 2009

Update

Hello, sorry I have not posted recently (like anybody noticed). I'll get back to you all as soon as I come up with something fun to post about. Which hopefully will be shortly.

Sunday, June 28, 2009

Americans And Their Cars

I wanted to bring this up because I have some opinions about Americans and their cars.
No offence to those Americans who have enormous cars, but I have some problems with these things.

Why do Americans drive ridiculously large cars? And why do American car companies glamorize the giant gas-guzzler? So what if your pickup can tow 600 million pounds? When the hell are you ever going to use that? In fact, if you really think about it, are you ever going to tow more than maybe a beer cooler and a couple of hunting rifles? Unless you either are in the construction industry or you have a very heavy mobile home, YOU DON'T NEED A GIGANTIC PICKUP.
And you with the SUV, how often do you really carry 7 people in your car and a trunk full of stuff at the same time? Maybe you do often, in which case that's fine. But if it's around once a year or less (which it most likely will be), then YOU DON'T NEED AN SUV.
Now as for Hummers, I have one thing to say to you: YOU DON'T NEED A HUMMER, FOR ANY REASON, EVER.

If you legitimately need a large car for your business or you have a family of like 8 that you frequently carry around, then this doesn't apply to you, but if all you use your car for is a 45 minute drive to work and back and to pick up pizza, then GET A FRICKIN' SEDAN, or even better, A SMARTCAR. Or if you want to go extreme, get this.

This post is dedicated to American personalities Ed McMahon, Farrah Fawcett, Michael Jackson, and Billy Mays, who all died this week. Stop killing off your celebrities, America!

Friday, June 26, 2009

Firefox Plugins I Couldn't Live Without (for me, at least)

There are a certain few plugins for Firefox that I just love. Let's go over them:

1. Perapera-kun: A very useful and handy popup translator/dictionary/kanji lookup for Japanese text. Works much better than translators. A Chinese version is available too. The homepage can be found here. Requires dictionary found here, download the dictionary/ies of your choice.

2. Babelfish Instant Translation: A little app that translates selected text using GoogleDic, Google translator, or Yahoo translator into or from any of the supported languages. Homepage here.

3. Download Statusbar: Replaces that bulky download box thing in Firefox. Homepage here.


4. Video DownloadHelper: Makes it really easy to download just about any video media embedded on a site, such as Youtube, or my personal favourite, ニコニコ動画. Homepage here.

5. Flagfox: Displays a country flag next to the URL of a site or on the status bar, and clicking on it will give you complete information of the server's location. Homepage here.

6. Personas for Firefox: Makes Firefox look a hell of a lot nicer. My personal favourite is the "Foxkeh" category. Homepage.


7. Scrapbook: Allows you to save an entire webpage to your computer, or even a whole site, and much more. Homepage.


8. StumbleUpon: The toolbar for the popular service, which opens a random high-rated site based on your interests. Terrific when you're bored. Also integrates with Google, etc. Homepage.

That about covers most of my plugins. Hopefully you will find some of these useful as well.

こんばんは!

You may have noticed that my header (as of posting this) has some Japanese subtext to it. Most of my readers probably do not speak Japanese, so the text is the following:
私たちは日本語も話せます。ちょっと。

If you can read or translate Japanese, all the better for you (and if you can, please tell me if my grammar is decent!) ┐('~`;)┌

But why did I make my header in Japanese? Well, because Japanese is an amazing language. It is simple, logical, expressive, and fun to read and write. Japanese culture is also very attractive to me. Unfortunately, as of now, I can only fluently speak English (obviously) and Spanish, to a degree. Pero el español, no es muy exótico en comparación con japonés. Chinese is alright, but I'm not as much of a fan of the culture (or government...), and it has no strict syllabaries (that I know of) whereas Japanese has two. Korean hangeul is a little too simple (although I have yet to learn it). The Indic languages are really interesting, but a bit too complicated and of more limited use for me. Middle Eastern languages I may toy with eventually, although I'll likely get put on an FBI watchlist. And Cyrillic languages... well, I'll get to them soon enough (Russian, anyone?)

Some examples of each. If you can't see some of the characters, then your computer sucks. On Windows, get the respective language packs installed, on Linux do the same (in Language Support), and Macs should already have them all, but I wouldn't really know:

An example of the Japanese language. 日本の言語のサンプル。
An example of the Chinese language. 一個例子中文。
An example of the Korean language. 한국의 언어의 예.
An example of the Hindi language. हिंदी भाषा का एक उदाहरण.
An example of the Arabic language. مثال على اللغة العربية.
An example of the Russian language. Пример из русского языка.
And my favourite, the Tibetan script (not sure what language or what it says):
།དས་ཀ་རད་པ་ཞ་བའ་སན་ལམ་

Thursday, June 25, 2009

I Love Linux, But It's a One-Sided Relationship

So I love Linux (specifically Ubuntu), but it can be a pain in the arse, a lot. Its beauty, power, and price are unmatched by anything else I have ever seen, including Windows and Mac. But its complexity rivals that of good jazz music. However, I can always forgive it :D
Linux comes with a massive userbase within its network of forums and how-to articles and wikis. It's all about searching for the right info, and you will find what you need. Hint: Google is my best friend.
Today Compiz, Ubuntu's manager of all its awesome beauty and special effects, is bothering me. It won't save my custom profile! But that's what the Compiz forums are for! I bet I'll have it fixed faster than Steve Jobs can put on his turtleneck.
EDIT: Yep, fixed it. Thank His Holy Noodliness for those who volunteer their time and skill to help others.

Here's an example of what Compiz can do. Of course, you'd need much more than 2:25 to show all its powers. The potential is astronomical. Thinking of switching yet?