Digital Comic Museum

Help and Support => Feedback and Suggestions => Topic started by: sevengates on May 15, 2013, 10:32:09 PM

Title: file numbering format
Post by: sevengates on May 15, 2013, 10:32:09 PM
Would it be possible for the format for the naming of all the files posted to DCM to be coordinated, with DCM setting a single standard that everyone could follow? 

Many of the issues posted do not have their full titles and sometimes have only a few letters and no number or they lead with the publisher or a code number.  All of this makes it difficult to find and sort files downloaded from DCM. 

I think a good format could be something like -- full title of the series -- followed by issue number -- followed by any variant descriptive information, such as the year or number of pages.  Also there should be a single standard for spaces, such as the underline symbol "_".

So an example would be: Horrors_At_Midnight_004_1952_36pp.cbr 
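
A minimal sketch of how a name in that pattern could be assembled, in Python. The function name `make_filename` and its parameters are made up for illustration, and the three-digit issue padding is only inferred from the example above.

```python
def make_filename(title, issue, year=None, pages=None, ext="cbr"):
    """Build a name like Horrors_At_Midnight_004_1952_36pp.cbr."""
    # Series title first, with underscores in place of spaces.
    parts = ["_".join(title.split())]
    # Zero-padded issue number (width 3 is an assumption).
    parts.append(f"{issue:03d}")
    # Optional descriptive information follows the issue number.
    if year is not None:
        parts.append(str(year))
    if pages is not None:
        parts.append(f"{pages}pp")
    return "_".join(parts) + "." + ext


print(make_filename("Horrors At Midnight", 4, year=1952, pages=36))
```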

If all posters followed such a simple standard format, think of what a pleasure that would be when sorting your downloaded files!

DCM is a superb site, and I love it the way it is, of course.  However, I think this one little change could elevate its functionality up a couple of notches. 

Thanks for listening to my opinion.  Looking forward to your thoughts on this matter.
Title: Re: file numbering format
Post by: JonTheScanner on May 15, 2013, 11:03:17 PM
Would it be possible? Sure. Is it likely to happen? No, I doubt it.

More importantly, I'd hate to see any animosity develop about the "right" way to name files. I wouldn't think it would, but then back when several of us were getting GCD started, I never could have imagined how much rancor would develop over things like whether the genre is "funny animal" or "anthropomorphic" either. This is a wonderfully pleasant group, and I'd really hate to see anything happen to it.
Title: Re: file numbering format
Post by: John C on May 16, 2013, 04:08:19 AM
I feel like we already make people's contributions feel unwelcome (even though they're very welcome) with what few rules we keep to.

More importantly, there is no standard that makes more than a small minority happy.  I want spaces in filenames, just the title and number, the number zero-padded to the length of the series, and (if needed) pages missing.  Some people want everything about the comic, like the publisher, year, scanner name, how it was edited, and so forth.  Yours is different than both extremes.
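
For illustration, a short Python sketch of the title-and-number scheme described above. The name `padded_name`, the "Example Comics" series, and the "(missing Np)" wording are all assumptions; only the idea of padding the number to the length of the series comes from the post.

```python
def padded_name(title, issue, last_issue, missing=None, ext="cbz"):
    """Title and number only, number padded to the series length."""
    # Pad width = number of digits in the highest issue of the series.
    width = len(str(last_issue))
    name = f"{title} {issue:0{width}d}"
    # Hypothetical way to note missing pages, added only when needed.
    if missing:
        name += f" (missing {missing}p)"
    return f"{name}.{ext}"


print(padded_name("Example Comics", 7, last_issue=119))
print(padded_name("Example Comics", 7, last_issue=12, missing=2))
```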

The net result is that (almost) everybody is going to need to rename the files they save anyway.

I have a possible solution brewing, but I need the time to work out the details and get it running.  I'll mention it when I have something to show.
Title: Re: file numbering format
Post by: sevengates on May 16, 2013, 06:57:17 AM
Your replies are very appreciated.  Thanks for your insights into the etiquette of posting to this site.  As you say, nothing is more important than the contribution of the files to the site with all other considerations being secondary.

That being said, your answers made two ideas come to mind.  Perhaps a simple suggestion (and not a rule), something like "please try to put the complete title first when naming your files", might be accepted by the majority of uploaders?

Also, is it possible for the administrators of DCM to rename new files?  The uploads are processed to some degree when you put them up on your site, so why not rename each file at that time? 

Hope you don't mind the further discussion of this subject, as I feel it would make a difference to the site. 

Your site is wonderful and my ideas are submitted with all due respect.  Thank you for your kind attention.
Title: Re: file numbering format
Post by: sevengates on May 16, 2013, 08:27:51 AM
Upon re-reading the above comments, I realized I should clarify my original comment.  My intent was not to suggest a required uniform format of my design or any design, nor was it to eliminate the need to rename files once downloaded.

My point, which perhaps I did not make clearly, was that once downloaded, some files are hard to even identify or find.  After all, there are quite a few files at DCM with names like "erwt23ter(sfe).cbr" or "3553___et___wrt.cbr" or even "00.cbr".

So the only point I wanted to make was that it would be helpful to place the series title at the beginning of any given file name.  I agree uploaders should not be required to follow a lot of rules, but perhaps making a gentle suggestion might help more uploaders to realize the usefulness of having at least the series title at the beginning of a file name.

Thanks for listening to my idea.  Love your wonderful site.
Title: Re: file numbering format
Post by: movielover on May 16, 2013, 08:39:59 AM
I might be wrong, but I believe the files would have to be downloaded, have the name changed, then be re-uploaded to the site.

The best bet is just to rename the file when you download it. Sounds simple, but it is the least time-consuming option. Trust me, downloading, fixing/renaming and re-uploading is a time-consuming process.
Title: Re: file numbering format
Post by: builderboy on May 16, 2013, 09:46:24 AM
This is interesting from the perspective that 1) I do rename all of the files once downloaded, and 2) I was one of the members who helped rebuild the old GAC site (post Yoc) when it experienced a catastrophic melt-down.  One of the massive hassles of that effort included a heap of files (thousands, all in one directory) with unidentifiable names.  You had to open each one, ID it, rename it, then re-upload it into the appropriate directory.  I don't know enough about DCM to know whether there is any risk of that, but what a mess GAC had!!

I do echo the comments about the difficulties of trying to establish a standard, and the efforts that it would take to rename the files to conform once you had it.

But, just for grins, what are some of your personal naming formats?  I always look for my books to naturally sort to the order in which they were published, so use the following format:

YYYY-MM_TITLE (v#) iss# [page count] Poster.cbz

With the page count, it will either be [c2c 36p] or [56 of 68p].  This way, I will know if upgrades come out.  If it is a fiche scan, the issue # will be followed by an "F".  I haven't worked out a good method of handling mixed paper/fiche files.
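
A rough Python sketch of that date-first format, showing why such names sort into publication order on their own. The function name `dated_name`, its parameters, and the way the issue number is written are assumptions; the bracket wording and the "F" suffix follow the description above.

```python
def dated_name(year, month, title, issue, pages, total_pages=None,
               volume=None, fiche=False, poster=""):
    """Build YYYY-MM_TITLE (v#) iss# [page count] Poster.cbz."""
    name = f"{year:04d}-{month:02d}_{title}"
    if volume is not None:
        name += f" (v{volume})"
    # A fiche scan gets an "F" right after the issue number.
    name += f" {issue}" + ("F" if fiche else "")
    # Complete scans are marked c2c; partial ones show pages present of total.
    if total_pages is None or pages == total_pages:
        name += f" [c2c {pages}p]"
    else:
        name += f" [{pages} of {total_pages}p]"
    if poster:
        name += f" {poster}"
    return name + ".cbz"


names = [
    dated_name(1952, 3, "Horrors At Midnight", 4, 36, poster="someposter"),
    dated_name(1949, 11, "Example Comics", 2, 56, total_pages=68, fiche=True),
]
# A plain alphabetical sort is also a chronological sort.
for n in sorted(names):
    print(n)
```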

I would be interested in hearing other takes.  I know some folks put the publisher in there as well.
Title: Re: file numbering format
Post by: John C on May 16, 2013, 10:20:21 AM
Quote
My point, which perhaps I did not make clearly, was that once downloaded, some files are hard to even identify or find.  After all, there are quite a few files at DCM with names like "erwt23ter(sfe).cbr" or "3553___et___wrt.cbr" or even "00.cbr".

Ah...I forgot about those.  Sorry about that.

Nobody does that on purpose, and no scanner or uploader selected those names.  Some part of the system crashed, a long while ago, and the database lost the upload names.

Unfortunately, as Movielover points out, the "solution" is to download it, rename it, and upload it again.  We occasionally try to run a dragnet through them, though, so please keep track of the ones you find, since we can't see them without downloading, and we don't want to download the books automatically to check them, since that costs us in bandwidth.
Title: Re: file numbering format
Post by: Yoc on May 16, 2013, 10:43:01 AM
ML is correct.  The file names (the name of the actual file as it appears when it's finally downloaded to your hard drive) are not set by DCM but by the original uploaders.  To change them would be a Massive undertaking.
I too have some ideas about future uploads but for now the only solution is for users to rename downloaded files for their own future use.

I sure wish there was a fast easy solution.  But if there were, the arguments about what the name rules should be could be nasty too.

-Yoc
Title: Re: file numbering format
Post by: sevengates on May 16, 2013, 05:16:24 PM
Quote
But, just for grins, what are some of your personal naming formats?  I always look for my books to naturally sort to the order in which they were published, so use the following format:
YYYY-MM_TITLE (v#) iss# [page count] Poster.cbz
With the page count, it will either be [c2c 36p] or [56 of 68p].  This way, I will know if upgrades come out.  If it is a fiche scan, the issue # will be followed by an "F".  I haven't worked out a good method of handling mixed paper/fiche files.

My take on a good comic book file name:

Definitely series title first, complete and with proper spacing between words

Next, issue number, and with volume number if applicable

Next, an indication of completeness, such as c2c or 36p, but just the number of pages in the file usually tells the tale

For the rest, publisher and date are nice

So an example of the simplest file name format would be:  The_World's_Biggest_Comics_009_36p_c2c.cbr

A little fancier version with extra info that is not essential:  The_World's_Biggest_Comics_017_28p_inc_Charlton_1954.cbr

My only goal is to be able to sort by name, not by publisher or date.  For me, that seems the best way to control huge numbers of comic book files.

Anybody got any other approaches that are interesting?
Title: Re: file numbering format
Post by: cimmerian32 on May 16, 2013, 05:52:37 PM
My standardized filenaming system is

Name of title 0003 [date-xx.publisher](c2c.scanner)
Title: Re: file numbering format
Post by: Mark Warner on May 17, 2013, 04:03:09 AM
This is an interesting idea! My initial thoughts are:

1) The file name is actually held in the database, so an admin page that allowed you to change the name would mean that you would not need to delete and reupload :)

2) I wonder if in fact no "work" needs to be done apart from writing a bit of a script when a user clicks on download to then automatically generate the name. Hmmmm.....

3) On CB+ I am altering .zip to .cbz and .rar to .cbr. Which is cool, BUT the downside to this is that I am creating new file names and am increasing the chances of duplicates. If you were to change all DCM file names then it could potentially lead to some confusion.

4) I think putting the GCD reference number in would be rather nice and also very useful from a programming point of view :)

5) Maybe scanners may not like the file names being messed with. Plus what about all the files that people already have on their hard drives (relates to point 3)?
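
Points 2 and 4 could be sketched roughly as below, in Python. Everything here is hypothetical: the `record` fields (`series`, `issue`, `gcd_id`, `stored_name`) are invented for illustration and do not describe the real DCM or CB+ database.

```python
import os


def download_name(record):
    """Generate the name offered at download time from stored metadata."""
    stored = record["stored_name"]
    # Without series metadata, fall back to the name as uploaded.
    if not record.get("series"):
        return stored
    parts = [record["series"]]
    if record.get("issue") is not None:
        parts.append(f"{record['issue']:03d}")
    # Point 4: include the GCD reference number when it is known.
    if record.get("gcd_id") is not None:
        parts.append(f"[gcd-{record['gcd_id']}]")
    # Keep whatever extension the stored file already has.
    ext = os.path.splitext(stored)[1].lower()
    return " ".join(parts) + ext


record = {"series": "Example Comics", "issue": 9, "gcd_id": 12345,
          "stored_name": "erwt23ter(sfe).cbr"}
print(download_name(record))
```

Because the stored file is never touched, nothing has to be deleted or re-uploaded; only the name presented to the downloader changes.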

Just pondering ... not sure if it is a good or bad idea. But summat I think we might look at on CB+
Title: Re: file numbering format
Post by: sevengates on May 24, 2013, 07:22:12 AM
Quote
BUT the downside to this is that I am creating new file names and am increasing the chances of duplicates. If you were to change all DCM file names then it could potentially lead to some confusion.

On the contrary, I think renaming all files with the single formatting standard of putting the series title first would help eliminate confusion and duplicates -- that is because there is so much variation of file name formats currently at DCM that files can't be properly sorted or identified, and in many cases dupes exist already -- certainly in the personal files of downloaders.  I know they do in my files, and it is very time-consuming to sort them out because of weird file name variations, some of them even in code, such as "trtr454__ere.zip".

If DCM automatically renamed every file with series title first, it would allow all files to be sorted precisely, easily identified, and all dupes to be eliminated.  Then in your personal files, after you download a file, when you rename it for your own storage all dupes will show up and can be eliminated.

In short, without getting fancy, please just ask everyone politely to: "Please Put The Series Title First" on every file uploaded.  Thank you for your kind attention.
Title: Re: file numbering format
Post by: John C on May 24, 2013, 03:27:48 PM
Quote
In short, without getting fancy, please just ask everyone politely to:

Excuse me.  I tried to be polite about this, and so have others, but apparently you didn't want to hear it.  Excuse me for being a little more blunt.


I don't want to suggest a "when you're paying the server bills every month, you get to make the rules" kind of situation, but as I said, everybody's a volunteer here.  Nobody is going to take direction from someone whose sole contribution to the community, to date, is to declare what they should do.

You've said your piece.  If there really are scanners hand-typing names that aren't useful to you, they've seen your complaint and it's up to them.
Title: Re: file numbering format
Post by: Yoc on May 25, 2013, 10:42:48 PM
It's hard to get consensus on many matters among those in our hobby.  I wish there were a way to easily (yes, I know, nothing is easy with code writing) automate the upload process to rename the zip/cbz files as they are uploaded.  I've even suggested a format to our head code writer, but in the grand scheme of running the site it's not near the top of his list.  There are several other features that we would rather he work on to improve the site before trying to tackle this one.
I appreciate your thoughts on the matter Seven but for now we are just happy to have members wanting to share their scans.  We aren't going to criticize them for their file name formatting.
-Yoc
Title: Re: file numbering format
Post by: bchat on May 29, 2013, 08:14:52 AM
Quote
Would it be possible for the format for the naming of all the files posted to DCM to be coordinated, with DCM setting a single standard that everyone could follow?

I think a good format could be something like -- full title of the series -- followed by issue number -- followed by any variant descriptive information, such as the year or number of pages.  Also there should be a single standard for spaces, such as the underline symbol "_".

So an example would be: Horrors_At_Midnight_004_1952_36pp.cbr 

If all posters followed such a simple standard format, think of what a pleasure that would be when sorting your downloaded files!

That wouldn't help me sort my downloaded files at all.  I would still need to rename them, since the naming/sorting format I use is not the same as yours (date, publisher, then title & issue number).

I'm not knocking the idea of there being a standard way of naming the actual files, because however it's done, I still need to rename every file I get, so it really doesn't matter to me either way.  But it takes two seconds for someone to rename a file after they've downloaded it, and you're asking anyone who isn't you to spend countless hours in order to identify, download, rename and then upload a file to meet your standard ... without getting paid for doing it.  Call me silly, but I don't think that's going to happen.
Title: Re: file numbering format
Post by: Yoc on May 29, 2013, 10:48:20 AM
It can only happen if it's somehow automated.  I'd rather our code master concentrate on new exciting features than something like this.  As nice as it might be.

-yoc
Title: Re: file numbering format
Post by: sevengates on May 31, 2013, 07:13:22 PM
I guess my last comment didn't sound like I meant it to sound.  I apologize if it sounded demanding, which was not my intention. 

I only meant to bring attention to the concept of file naming.  I thank everyone who took the time to consider the issue.  I understand what you are saying re why files have the names they do. 

DCM is a fabulous site, and thanks to all who make it happen.  Peace out.
Title: Re: file numbering format
Post by: garbanzo on December 16, 2017, 02:14:59 PM
Lazy filenames are my only complaint about this site. Why can't a standard be enforced? Scanner release hubs on DC++ enforce file naming conventions, and those hubs can see several hundred new files uploaded in a single day. They have no problems enforcing a convention. Now consider the fact that the number of uploads per week at DCM can usually be counted on one hand - that's hardly a chore to moderate. There's no need for automation. If the site needs volunteers to moderate uploads, just ask. I'm sure you'll get plenty of responses.


Of course you'll never find a standard that pleases every user. Nobody can. And no file sharing site I am part of even considers this as something to be concerned about. They set the standard and enforce the standard. If users want to rename files on their own, that's their prerogative. But at least you can look at a file you've downloaded and actually know what it is. Here, there's no telling with some files. cmj34.zip - any clues?


A comic book's indicia page tells us its title. The issue number is found on the indicia and/or the cover. A filename should at the very least start with the full and proper title, followed by a space, followed by the issue number, followed by another space if anything comes after that.


If the scanner/uploader wants to zero-pad the number, or to add issue year and/or month, or stick on a bunch of other stuff like publisher or scanner/editor tags or random characters, that's fine. But the root filename can and should be standardized.
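
As a rough illustration, the "root filename" rule above could be checked with a regular expression like the one in this Python sketch. The pattern and the name `parse_root` are assumptions, and a pattern can only test the shape of a name; it cannot tell whether the title is the full and proper one from the indicia.

```python
import re

# Title, a space, the issue number, then either the extension
# or a space followed by anything else the uploader wants to add.
ROOT = re.compile(
    r"(?P<title>.+?) (?P<issue>\d+)(?: (?P<rest>.+))?\.(?:cbz|cbr|zip|rar)",
    re.IGNORECASE,
)


def parse_root(filename):
    """Return (title, issue) if the name fits the rule, else None."""
    m = ROOT.fullmatch(filename)
    if m is None:
        return None
    return m.group("title"), m.group("issue")


for name in ["Captain Marvel Jr 034 1946 c2c.cbz", "cmj34.zip"]:
    print(name, "->", parse_root(name))
```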


Also, why allow ZIP and RAR files, when CBZ and CBR are standards?


That's just my two cents. Take it or leave it :)
Title: Re: file numbering format
Post by: OtherEric on December 16, 2017, 04:10:09 PM
Not all books have numbers.  Not all books have the indicia agree with the title.  There are lots of titles that had multiple series, and your proposed filename might not cover some of those duplicates. For the life of me, I don't know why everybody doesn't rename files to CBZ or CBR but they don't.

This is an all-volunteer operation.  Even if we had a standard naming convention, enforcing it would be a problem.  The way our software works, it's surprisingly hard to change the name of the file; the only way I know is to download it, rename it, and reupload it.  It's a tech limitation.  Even more importantly, there are scanners who would take offense if we renamed the convention they use on their books.

Even if we got past all that, one of the reasons we only get a few scans a week is because such a high percentage of books that are usable on the site at all are already uploaded.  A few files a week wouldn't be difficult... but renaming literally thousands of older scans would be.

Also, there are people out there who track what books they have by the names of the scans, and particularly what version of the scan if multiple versions exist.  Renaming the files would create confusion for a lot of people over what they already have.

If we were starting this project all over, yeah, we might have tried to create and enforce a standard.  At this point it would probably do more harm than good.

Oh, and I'm willing to bet cmj34.zip is Captain Marvel Junior #34 or 35.  (Those issues got merged into one for reasons lost in the mists of time.)  But I'm deliberately not checking until after I post this.
Title: Re: file numbering format
Post by: OtherEric on December 16, 2017, 04:17:27 PM
And for what it's worth, I seem to have been wrong on the cmj34...
Title: Re: file numbering format
Post by: garbanzo on December 17, 2017, 05:41:20 AM
As I re-read my post with fresh eyes this morning, it struck me that I may have come across as ungrateful. I am certainly not! I adore this site, and I'm indebted to everyone who makes it happen, from the buyers to the scanners to the editors to the uploaders to the coders to the staff. I'm just offering my perspective as a user. Moreover, in the real world I work as a process improvement project manager, so I'm rarely content with how things are, when I can imagine how things could be. And I stubbornly refuse to accept the idea that things ought not change for the sake of appealing to the phrase "that's how they've always been"!


That said, I really don't think things are as complicated as you suggest. Yes, there are exceptions. And if you try to build your rules around those exceptions - those few odd books, or those few picky users - you'll give yourself a headache. It's easier to let them be what they are - exceptions to the rule.


I refer again to some torrent trackers I belong to (all of which are also run by volunteers). One in particular hosts 1.2 million torrents. Rules are in place about those uploads - what type of file they can be, how the files are named, what folder structure is used - and by golly, it works! All 1.2 million files conform to the rules.


You do make a good point that most files that might be uploaded here have already been uploaded here, and that going back and changing them would be a chore. You also suggest that things might be done differently if you were to start over from scratch. Why not consider that? This site is great as a massive, lightly-organized dump of scans, but with some thoughtful curation work, it could become a truly amazing archive of history - a "museum" in the truest sense of the word.


Anyway, I need to get back to cataloging my own collection. Let me know if you ever feel like starting fresh, I would love to help.
Title: Re: file numbering format
Post by: Captain DJ on December 17, 2017, 05:56:20 AM
I agree that uploading in rar or zip format is bad; we need to do more about this. We could even script it to change file extensions from rar to cbr on the server side after they have been uploaded, or go as far as banning .rar / .zip uploads.


Most of this goes back to the time when we were happy to have any upload, so we weren't overly worried about the format / filename etc. as long as we had it on the site.
Title: Re: file numbering format
Post by: Yoc on December 17, 2017, 10:06:19 AM
If a script to rename rar and zip files to the cbr/cbz file name were easy to do, please feel free, Capt.
But I don't think we should 'ban' them.  Some people out there might have no choice (as far as they know) in what format they share with.

-Yoc
Title: Re: file numbering format
Post by: Snard on December 18, 2017, 05:53:10 AM
I agree with Yoc's statement about not banning any particular extension (i.e. .rar or .zip). I personally keep all of my comic archives named as .zip, since I don't use CDisplay, but rather choose to use built-in Windows tools for viewing the files (Windows natively supports the .zip extension). When I do uploads to DCM, I try to remember to make a copy of the file, renamed as .cbz, before I upload. However, if there were a script or action on DCM that took care of the renaming for me during uploads, I would welcome that, since it would be one less thing to remember to do.  (As a side note, the upload service for DC++ does something like this.)

Thanks for all you do.
Title: Re: file numbering format
Post by: Captain DJ on December 18, 2017, 12:12:02 PM
Have changed the code so that when a user clicks download, .zip files change to .cbz and .rar files change to .cbr.
So no .zip / .rar files should now ever be offered for download.


No change on uploads, users can still upload .zip and .rar files.
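
For readers curious what such a download-time rename might look like, here is a minimal Python sketch. It is not the site's actual code; the names `DOWNLOAD_EXT` and `download_filename` are invented for illustration.

```python
import os

# Extension offered to the downloader for each stored extension.
DOWNLOAD_EXT = {".zip": ".cbz", ".rar": ".cbr"}


def download_filename(stored_name):
    """Swap .zip/.rar for .cbz/.cbr; leave any other name unchanged."""
    base, ext = os.path.splitext(stored_name)
    return base + DOWNLOAD_EXT.get(ext.lower(), ext)


for name in ["cmj34.zip", "Book.RAR", "already_fine.cbz"]:
    print(name, "->", download_filename(name))
```

Only the name sent to the browser changes in this sketch; the stored file and its contents stay as uploaded.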
Title: Re: file numbering format
Post by: Yoc on December 18, 2017, 03:06:29 PM
Thanks very much Capt.  A nice update for those especially unable to change file formats for any reason.

-Yoc
Title: Re: file numbering format
Post by: garbanzo on December 18, 2017, 04:36:36 PM
A welcome change! Thank you all for considering, discussing, and implementing 8)
Title: Re: file numbering format
Post by: Captain DJ on December 18, 2017, 05:19:44 PM
We always welcome suggestions; if the community wants the change then we will make it.
Title: Re: file numbering format
Post by: Captain DJ on December 19, 2017, 03:36:34 PM
We have made one further change: the upload function now detects the type of file being uploaded and will auto fix common file extension problems.


For example:
a .zip file uploaded as a .cbr file will be auto fixed to a .cbz file
a .rar file uploaded will be auto fixed to .cbr
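
One common way to detect the real file type is to look at the first few bytes of the upload rather than trusting its extension. The Python sketch below assumes that approach; it is not the site's actual code, and `fixed_extension` is a made-up name. The byte signatures themselves are the standard ones for ZIP and RAR archives.

```python
import os

ZIP_MAGIC = b"PK\x03\x04"    # start of a non-empty ZIP archive
RAR_MAGIC = b"Rar!\x1a\x07"  # prefix shared by RAR 4 and RAR 5 archives


def fixed_extension(filename, head):
    """Return filename with the extension matching its content.

    head is the first few bytes read from the uploaded file.
    """
    base, _ext = os.path.splitext(filename)
    if head.startswith(ZIP_MAGIC):
        return base + ".cbz"
    if head.startswith(RAR_MAGIC):
        return base + ".cbr"
    # Unknown content: leave the name alone.
    return filename


print(fixed_extension("book.cbr", b"PK\x03\x04..."))    # ZIP content
print(fixed_extension("book.rar", b"Rar!\x1a\x07\x00"))  # RAR content
```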


This should reduce work for staff and stop common problems with the Comic Preview feature etc.



Title: Re: file numbering format
Post by: Yoc on December 19, 2017, 04:34:34 PM
Thanks very much for working on this Capt!
:D
Title: Re: file numbering format
Post by: OtherEric on December 22, 2017, 12:50:05 AM
Quote
A welcome change! Thank you all for considering, discussing, and implementing 8)

Please allow me to apologize for having replied somewhat harshly to your original post, by the way.  At least one of the problems where I didn't have an easy solution did turn out to have one.

Did you ever figure out what the cmj34.zip was?
Title: Re: file numbering format
Post by: garbanzo on January 06, 2018, 06:10:01 AM
Quote
A welcome change! Thank you all for considering, discussing, and implementing 8)

Please allow me to apologize for having replied somewhat harshly to your original post, by the way.  At least one of the problems where I didn't have an easy solution did turn out to have one.

Did you ever figure out what the cmj34.zip was?


No worries at all. I can have a big mouth sometimes if I think I see something happening that I think is inefficient or ineffective. Since I'm not a frequent poster here, maybe I could have eased into the discussion a bit better.


cmj34.zip was just an example I concocted, not a specific filename. But your first guess was basically correct, I had just finished sorting Captain Marvel Jr, and one of the files was named similarly.