
ACCLAIM'S complete file archive rescued

Discussion in 'File Downloads - Share and Request' started by ASSEMbler, Mar 30, 2007.

  1. ping

    ping Guest

    This thread caught my attention just now and I would like to share my
    thoughts on this process. Excuse me if I repeat some things that have been
    said before. Also I'm expecting to see some 'who the heck is that guy'
    replies: yes I am new around here.

    As the amount of data is quite massive, structuring it will require good
    organization. Merely having the files online won't help much.
    Factors that have to be taken into account, as far as I can think of right now,
    are hardware, operating system, network, organizing the collaboration,
    access/access restrictions, documentation, software, and responsibilities.
    I'll try to say something about each of these points.

    -------------------------------------------------
    (1) Hardware:
    -------------------------------------------------

    From what I could gather from reading the thread, the server is a Dual-P3
    with 1.5GB of RAM and hard disks amounting to 280GB ("Two 80GB, a 40GB, a 160 GB").

    Given the vast amount of data to crawl through, I think the 280GB won't cut it.
    On the one hand, one could argue that it is not necessary to have all the
    data available at one time. Then again, if the data is really completely unsorted,
    files belonging to one project might be spread over various discs. If only parts
    are available, it might be hard to fit those into context. I think several people
    have offered to contribute either hardware or cash - it might be a good idea to actually
    take them up on these offers.

    As for the system power, a Dual-P3 could certainly be brought to its knees in such a
    scenario. However, it really depends on how the rest of the project is laid out: how many
    accounts will be given out and which services will be running on the system.

    -------------------------------------------------
    (2) Operating System
    -------------------------------------------------

    As for an operating system, I'd strongly suggest going for a Linux installation. This allows
    for easier remote configuration and administration. Also, the necessary software is available
    for free. Another point is that the organization of the project will most certainly require
    a good deal of scripting, which you certainly don't want to do with DOS batch scripts. Lastly,
    I'd claim that you are likely to get better performance with a slim server installation of
    Red Hat or SuSE than with, say, Windows Server 2003. (But please let's not drift into a Linux
    vs Microsoft discussion)

    -------------------------------------------------
    (3) Network
    -------------------------------------------------

    Again, the requirements here depend on the dimensions the project takes on. However, I think
    that several Mbit/s would be a must-have. If you have 10 people logged in going through/downloading
    data on a 1 Mbit/s line, that would be rather uncomfortable.
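
    To put a rough number on "uncomfortable", here is a minimal sketch of the per-user
    arithmetic in Python. It assumes the line is shared evenly and ignores protocol
    overhead; the 100 MB file size is purely an example figure chosen for illustration.

```python
def transfer_seconds(size_bytes, line_bits_per_second, users):
    """Seconds one user needs for a file when the line is split evenly."""
    per_user_bits_per_second = line_bits_per_second / users
    return size_bytes * 8 / per_user_bits_per_second


# Assumed example: a 100 MB file, a 1 Mbit/s line, 10 simultaneous users.
seconds = transfer_seconds(100 * 10**6, 1_000_000, 10)
print(f"{seconds / 3600:.1f} hours")  # prints "2.2 hours"
```

    Each of the 10 users ends up with about 100 kbit/s, so even a modest file ties up
    the line for hours.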

    -------------------------------------------------
    (4) "collaboration / organization"
    -------------------------------------------------

    Once the servers are set up and running, the 'real fun' starts: figuring out how to categorize/identify
    the big mess. For this process I have the following thoughts:

    (i) Global Index:

    First of all, I would recommend creating a global index over all data. This would
    require first labeling each available (physical) medium uniquely. Each medium is then scanned, creating
    a file index, possibly with the following information:

    [media-id] [filename] [filesize] [md5 checksum] [file-type-guess]

    Here media-id is the media label just mentioned. [file-type-guess] could be a combination
    of what the *nix `file` command reports and perhaps what ucon64 reports, although I expect
    that binaries which ucon64 recognizes would be a clear minority of the files.

    This index could then be used as a starting point. First of all, it could be used to identify
    duplicates (backups). Ideally I would place this index in a mysql database for reference and
    querying.
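
    A minimal sketch of such an index in Python, assuming each medium has been copied or
    mounted under some directory. The function names and the in-memory list of rows are
    illustrative assumptions; the extension-based `mimetypes` guess is only a stand-in for
    the `file` command and ucon64, and the mysql step is left out.

```python
import hashlib
import mimetypes
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images do not have to fit in RAM."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_media(media_id, root):
    """Return one row per file: media-id, filename, filesize, md5, type guess."""
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rows.append({
                "media_id": media_id,
                "filename": os.path.relpath(path, root),
                "filesize": os.path.getsize(path),
                "md5": md5_of_file(path),
                # Stand-in only: guesses from the extension, not the content.
                "type_guess": mimetypes.guess_type(name)[0] or "unknown",
            })
    return rows


def find_duplicates(rows):
    """Group files sharing checksum and size; keep groups with 2+ members."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["md5"], row["filesize"])
        groups[key].append((row["media_id"], row["filename"]))
    return {key: places for key, places in groups.items() if len(places) > 1}
```

    Running `index_media` once per labeled medium and concatenating the rows gives the
    global index; `find_duplicates` then lists which files appear on more than one medium.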

    (ii) Divide-and-conquer:

    Obviously the work has to be split among groups/people to be manageable. Therefore I would suggest
    assigning groups to different parts of the data based on the global index. Some individual(s) should
    take care of assigning chunks of data to the willing.

    "So what to do with that chunk of data?" -- I would suspect that in most cases the data corresponds
    to some specific project. In that case the goal should be to identify which project a given set of data belongs to,
    what the state of the project was, and whether the files of the project appear to be complete (might be hard to judge).

    Groups: Each group should have one administrator who, for example, has access to the database and/or read+write
    permission on the data storage. Obviously this would require some familiarity with the software used, and also
    good knowledge of the platform the group is dealing with.

    (iii) Data-Organization & Accounts

    Unidentified raw data should be available to the group-admins, sorted according to the info of the global index in
    the mysql table. I would avoid providing each group too much scratch space on the server itself. I'm tempted to
    suggest storing the raw data in a svn repository and using locks to prevent people from working on the same files. This,
    however, would require a lot more HDD space. Therefore (s)ftp might have to do.
    (The suggestion of VNC accounts in the forum gave me the shivers.. )

    Structure:

    When a group-admin decides that the group is done with its chunk, he creates a new directory for the sorted data and
    the correspondence is recorded in the global index. The categorized raw data is then either moved or locked. The
    (dir) structure of the 'cleaned/sorted' data will largely depend on the contents of the data. This is something
    that should be decided after examining the global index.

    (iv) Logging the process

    The process of each group should be documented, either on a wiki page or maybe just as part of a svn repository.


    -------------------------------------------------
    (5) The Rest
    -------------------------------------------------

    As I expected, I have already lost track of the things I wanted to point out. I'll just summarize everything else I have
    in mind now in random order ..

    * Some might think that I'm going too far with my suggestions, but I think just putting the files online somewhere and taking the
    brute-force approach won't cut the mustard, i.e. it will end in chaos.

    * Work: It looks like the whole thing will require quite some work..
    - setting up the linux server
    - setting up a database
    - creating the index
    - setting up svn and a wiki page
    - maintaining and designating "jobs" to 'group-admins'
    - maintaining the ftp server
    - actually going through files identifying them: Each group should probably have at least one person with
    some programming skills.

    * ssh : group-admins should preferably have ssh accounts, while the others only have ftp access

    * traffic-shaping: Some traffic shaping would certainly be a good idea..

    * Distribution: When the files are done someone should probably take care of creating torrents for distributing the data

    * costs: The whole process could cause a LOT of traffic.... $$$

    ------------------------

    That's it. That was a rough sketch of what I had in mind when I discovered this thread. Any comments? :)

    PS: Personally I'd be most interested in seeing some SNES stuff ;]
     
  2. opethfan

    opethfan Dauntless Member

    Joined:
    Dec 13, 2006
    Messages:
    753
    Likes Received:
    2
    Holy shit! That's a lot of detail. I don't know how much is truly relevant to Assembler's plans, but it definitely seems like a great plan to build on.
     
  3. kammedo

    kammedo and the lost N64 Hardware Docs

    Joined:
    Sep 24, 2004
    Messages:
    2,138
    Likes Received:
    12
    Linux is too late, I fear - Kev already put Windows on it... As I said before, using Linux would improve performance, but would possibly not get you access to all software...
    That could still be handled by FTP, now that I think about it....
     
    Last edited: Apr 23, 2007
  4. ping

    ping Guest

    ftp should be avoided whenever possible :)

    And I seriously can't imagine pulling this whole thing off under
    Windows, to be honest... but well, I'm only a "spectator" at this
    point anyway..
     
  5. ServiceGames

    ServiceGames Heretic Extraordinaire

    Joined:
    Nov 20, 2005
    Messages:
    1,218
    Likes Received:
    5
    I don't quite understand your reservations about using Windows software. It may not be the best option, but there are thousands of businesses that use their OSes in a similar manner to what Kev is looking to do.
     
  6. ping

    ping Guest

    well..

    1. performance
    2. ssh/glftpd/subversion/bash/python/mysql/apache/wiki

    it's not like there is some standard software solution for this project, and
    tailoring everything as needed would most certainly be easier under Linux, at
    least that's what I am thinking. Linux surely can be a pain in the arse in many
    respects, but this time I'd consider it the better solution.
     
  7. retro

    retro Resigned from mod duty 15 March 2018

    Joined:
    Mar 13, 2004
    Messages:
    10,354
    Likes Received:
    822
    OK, I got the blog sorted to a point where I'm relatively happy with it.

    Kevin, I'll contact you when you're on next and see what you think. ;-)
     
  8. APE

    APE Site Supporter 2015

    Joined:
    Dec 5, 2005
    Messages:
    6,416
    Likes Received:
    138
    Linux kicks Windows to the curb when it comes to servers. Problem is, if you don't know every facet of Linux it's really hard to properly lock it down.
     
  9. kammedo

    kammedo and the lost N64 Hardware Docs

    Joined:
    Sep 24, 2004
    Messages:
    2,138
    Likes Received:
    12
    Have to correct you on that: Linux is already locked down by default. Problem is, you can't unlock it properly if you don't know each facet of it :lol:.
    Windows isn't that bad at all. For my part, I brought Linux into the discussion when the security issues were mentioned. Nothing more and nothing less.
     
    Last edited: Apr 24, 2007
  10. ASSEMbler

    ASSEMbler Administrator Staff Member

    Joined:
    Mar 13, 2004
    Messages:
    19,394
    Likes Received:
    995
    There are some huge problems with the DSL line; sustained data is nonexistent.

    I'm probably going to have them run a new line to the house, or if they won't do that, cancel and buy business cable.

    However, that is $70 a month versus $19
     
  11. kammedo

    kammedo and the lost N64 Hardware Docs

    Joined:
    Sep 24, 2004
    Messages:
    2,138
    Likes Received:
    12
    Cable would be great. FTP would work just flawlessly then.
    But 70 bucks a month is, well, expensive. Any hope of getting some kind of special offer?
     
  12. karsten

    karsten Member of The Cult Of Kefka

    Joined:
    Mar 14, 2004
    Messages:
    4,015
    Likes Received:
    149
    another important thing is "do we REALLY want to work on sorting ALL of that data?"

    i mean, is having 120 folders containing the WIP of the same game really necessary? or could we just work on sorting the first and last builds?

    and another thing that is IMPORTANT to do is SORTING out the volunteers so that work can be planned in advance. I mean something like all the volunteers filling in a form like

    tech knowledge (for documentation)
    3d programs knowledge
    able to work and sort on pics
    Programming knowledge (c, c++, VB, java etc)

    and so on...

    that would allow us to count the volunteers and check what the volunteers can ACTUALLY do.
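
    A minimal sketch of how such form answers could be tallied, in Python. The volunteer
    names and skill labels below are made-up placeholders, not real sign-ups, and the
    record layout is an assumption.

```python
from collections import Counter

# Hypothetical sign-up entries; names and skill labels are placeholders.
VOLUNTEERS = [
    {"name": "volunteer_a", "skills": {"documentation", "programming"}},
    {"name": "volunteer_b", "skills": {"pictures"}},
    {"name": "volunteer_c", "skills": {"3d", "programming"}},
]


def count_by_skill(volunteers):
    """Count how many volunteers listed each skill on the form."""
    counts = Counter()
    for person in volunteers:
        counts.update(person["skills"])
    return counts


def with_skill(volunteers, skill):
    """Names of the volunteers who listed the given skill, sorted."""
    return sorted(p["name"] for p in volunteers if skill in p["skills"])
```

    `count_by_skill` answers "how many people can do X", and `with_skill` gives the names
    to contact when a chunk of data needs that skill.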

    karsten
     
  13. kammedo

    kammedo and the lost N64 Hardware Docs

    Joined:
    Sep 24, 2004
    Messages:
    2,138
    Likes Received:
    12
    Count me in

    tech knowledge : Informatic engineering
    3d progs : general knowledge (CV - and some 3D background)
    able to work and sort on pics - well, nobody should have a problem with this ;)
    Programming knowledge :
    c, c++, c#, java, Delphi, VB, SQL, perl (some), php (knowledge but less than no experience)
    Interested in (you forgot this one karsten ;) ): SNES related stuff
     
    Last edited: Apr 24, 2007
  14. karsten

    karsten Member of The Cult Of Kefka

    Joined:
    Mar 14, 2004
    Messages:
    4,015
    Likes Received:
    149
    maybe you haven't noticed it, but i'm Mr. nothing here :D i'm not ordering, planning, or anything else. i'm not the boss! :D

    i'm just offering suggestions and my help in the task
     
  15. kammedo

    kammedo and the lost N64 Hardware Docs

    Joined:
    Sep 24, 2004
    Messages:
    2,138
    Likes Received:
    12
    Bah, the only thing that counts is passion :lol:
     
  16. karsten

    karsten Member of The Cult Of Kefka

    Joined:
    Mar 14, 2004
    Messages:
    4,015
    Likes Received:
    149
    moved?
     
  17. Omar

    Omar Robust Member

    Joined:
    Mar 15, 2004
    Messages:
    274
    Likes Received:
    24
    We could also consider having the data mirrored off-site.
    Like, Kevin filling a few hard drives for those people (who have the space), so we can have several servers for downloading, and an obvious backup source.
     
  18. kammedo

    kammedo and the lost N64 Hardware Docs

    Joined:
    Sep 24, 2004
    Messages:
    2,138
    Likes Received:
    12
    That would mean splitting data up. And possibly create even more confusion.
    Not to mention leakage and security issues.
     
    Last edited: Apr 24, 2007
  19. ASSEMbler

    ASSEMbler Administrator Staff Member

    Joined:
    Mar 13, 2004
    Messages:
    19,394
    Likes Received:
    995
    I'm up to my eyeballs with traceroutes atm. Soon as I get them to run me a new line and/or change the circuit I am on.

    I got the 8 port vpn router today.
     
  20. kammedo

    kammedo and the lost N64 Hardware Docs

    Joined:
    Sep 24, 2004
    Messages:
    2,138
    Likes Received:
    12
    Great! Can't wait ^^
     