Time has shown us that there's never enough computing power (I'm NOT talking about browsing the Internet or writing a text file here...), but what can we do to achieve our goals using computers as fast as possible?! There are a few options (off the top of my head):
1. buy better computers
2. use any computer you can get your hands on
1. We always buy better computers in order to do stuff faster, but there are a lot of limitations:
a. budget: we can buy STA (state-of-the-art) computers with 4, 6, etc. cores that will make our life easier, but is this really a good idea?! The answer is NO. Buying an i7 at 3 GHz with 4 cores costs about $300-400 depending on which country you live in. Now, 3 GHz with 4 cores is not the fastest you can get; Intel has way better CPUs than that -- the Extreme series -- and they also try to cram as many cores as they can into a CPU, but let's just stop at the Extreme series, which costs about $1,000/CPU (of course it's worth the price, but that depends on your needs). That's a lot just for a processor, so depending on your budget you can buy or skip.
b. operating system: some OSes are better than others -- depending on your needs, of course. Let's take Windows for example: it is a very good OS for entertainment and office work, but when you need to run tasks that take hours/days/weeks to complete, is it good?! I honestly can't give a definitive answer on this, because for tasks that need a lot of time to complete I turn to my geek friend Linux -- it is very stable, it manages resources very well, and if you don't need a GUI (graphical user interface) it's pretty much rock-solid.
2. What do I mean by "use any computer you can get your hands on"?!
It's no secret that a lot of companies connect a bunch of computers together through a communication protocol and use each computer as a thread -- WAIT!! How does this work?!!
Basically it depends on the developers... You can have a system that is the Master, on which you execute special programs and which sends task-execution requests to 2 or more Slaves; when a slave completes its task, it sends the result back to the master and waits for another request -- pretty simple, eh?! In essence yes, in practice NOT!!
Here is the basic idea:
step 1. Master => send request => slave(s)(1..N computers) -- usually at least 2!!
step 2. Master waits for all slaves to complete the tasks
step 3. when a slave completes its task it sends the result back to the Master
step 4. Master processes result(s)
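The four steps above can be sketched as a tiny master/slave loop over TCP. This is a toy in Python (the same shape maps to Free Pascal sockets): one slave thread on localhost stands in for a remote machine, and the newline-delimited JSON protocol, port number and "square a number" task are all my own assumptions, not a fixed design.

```python
import json
import socket
import threading

HOST, PORT = "127.0.0.1", 9009  # assumed demo address

def slave():
    """A slave: receive tasks, do the work, send results back (steps 1 and 3)."""
    with socket.create_connection((HOST, PORT)) as conn:
        rf, wf = conn.makefile("r"), conn.makefile("w")
        for line in rf:
            task = json.loads(line)
            if task.get("cmd") == "stop":
                break
            # the actual "work" -- here just squaring the number the master sent
            wf.write(json.dumps({"id": task["id"], "value": task["value"] ** 2}) + "\n")
            wf.flush()

def master(values):
    """The master: assign tasks, wait, collect the results (steps 1, 2 and 4)."""
    results = {}
    with socket.create_server((HOST, PORT)) as srv:
        # start the slave only after the server is listening
        threading.Thread(target=slave, daemon=True).start()
        conn, _ = srv.accept()  # one slave for brevity; use at least 2 in practice!
        rf, wf = conn.makefile("r"), conn.makefile("w")
        for i, v in enumerate(values):
            wf.write(json.dumps({"id": i, "value": v}) + "\n")
            wf.flush()
            reply = json.loads(rf.readline())  # step 3: slave sends result back
            results[reply["id"]] = reply["value"]
        wf.write(json.dumps({"cmd": "stop"}) + "\n")
        wf.flush()
    return results

results = master([2, 3, 4])
print(results)  # -> {0: 4, 1: 9, 2: 16}
```

A real master would of course talk to many slaves at once and overlap the sends instead of waiting for each reply in turn; this only shows the request/result round trip.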
Fairly simplistic, right?! But why do I say "at least 2 computers"?!
Over time we have all witnessed hardware failures (I'm proud that I haven't had too many -- yet!!). Let's say we've got a highly intensive task that we believe will take "forever" to complete -- a matter of days. WHAT IF during this time one of the slaves has a hardware failure?! You've lost a shit-load of time, and we all know the equation:
time = money -> lose time => lose money. Another way to see this: the less time you spend doing something, the more money you earn.
Sooo... let's review what is one of the best approaches you can take when you need huge computing power:
1. get as many systems as you can -- no matter how powerful the CPU is or how much RAM the system has
2. implement the logic and the communication protocol (avoid using hard disks as much as possible <-- slowest part of the computer)
3. start using your new hardcore computer network
4...N. always improve the idea!!
Now, let's try to throw some ideas of a possible implementation:
- create a flexible communication protocol (I prefer TCP/IP because you can transfer GBs of data in seconds) -- maybe use XML?!
- choose the cleanest Linux distribution you can think of -- avoid using GUI for better performance(on slave side)
- implement integers (huge integers that can grow up to trillions of digits long), strings (huge strings that can be concatenated from 2 or more slaves), objects (which have their own methods that get transferred along with them master-slave, slave-master, slave-slave), etc.
- use some kind of ping mechanism so that the Master automatically "knows" when a slave is dead and takes appropriate action (send the task to another slave, e-mail the tech department, etc.)
- the Master must NOT execute tasks itself -- it only needs to assign tasks to slaves and communicate with them
- if you try hard enough you can also make the slaves "know" when the Master has a failure, so that another task-free slave can take its place
- you will have to use a very fast interpreter
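The ping idea from the list above can be sketched in a few lines: every slave sends a heartbeat every few seconds, and the master declares a slave dead once it has been silent longer than some timeout. This is a minimal sketch; the class name, the 2-second timeout and the simulated silence are all assumptions for the demo.

```python
import time

TIMEOUT = 2.0  # seconds of silence before a slave counts as dead (assumed value)

class HeartbeatMonitor:
    """Tracks the last heartbeat time of each slave on the master side."""

    def __init__(self):
        self.last_seen = {}

    def ping(self, slave_id):
        """Called whenever a heartbeat arrives from a slave."""
        self.last_seen[slave_id] = time.monotonic()

    def dead_slaves(self):
        """Slaves whose last heartbeat is older than TIMEOUT."""
        now = time.monotonic()
        return [s for s, t in self.last_seen.items() if now - t > TIMEOUT]

mon = HeartbeatMonitor()
mon.ping("slave-1")
mon.ping("slave-2")
mon.last_seen["slave-2"] -= 10   # simulate slave-2 having gone silent 10s ago
print(mon.dead_slaves())         # -> ['slave-2']
```

When `dead_slaves()` reports something, the master would reassign that slave's pending tasks and fire off the tech-department e-mail mentioned above.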
What do we get out of this?! Well, some of you know that you can buy good old Pentium 4 computers at 2.x-3 GHz with 512 MB or 1 GB of RAM for ~$100 -- WAIT!! So I can have 10 cores for $1,000?!?! Yup...
You can also implement this in such a way that you can use virtually any OS -- YES, you can have 2 slaves on Windows 2000, 5 slaves on Windows XP, 20 slaves on Linux, 8 slaves on OS X, etc.
Sooo... the "hardcore" system can have a lot of slaves running on multiple platforms AND you can always ADD more slaves to the network. OK, but where's the drawback? I know there must be at least one -- yes, there are plenty, but it basically depends on the developer(s):
- startup can take anywhere from a few seconds to a few minutes (depending on the initialization implementation -- it needs to run at the beginning of program execution) -- this can be tuned!!
- you will have to take care of synchronization -- that's normal in a multithreaded environment
- if the master dies, all program progress can be lost -- this depends entirely on the implementation of the "main executor", or Mr. X ;-)
- you also need to take each system's configuration into consideration -- depending on this you can execute small tasks on Pentium 3 systems and bigger ones on P4s or i3/i5/i7s
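The synchronization drawback is worth a tiny illustration: on the master, several threads (one per slave connection, say) will update shared state, and unguarded updates can interleave and lose results. A minimal sketch with a lock -- the thread count and workload are arbitrary:

```python
import threading

total = 0                  # shared state, e.g. an aggregated result on the master
lock = threading.Lock()

def add_results(partials):
    """Fold a slave's partial results into the shared total."""
    global total
    for p in partials:
        with lock:         # without this, `total += p` can interleave and lose updates
            total += p

# four "slave handler" threads, each contributing 10,000 partial results of 1
threads = [threading.Thread(target=add_results, args=([1] * 10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)  # -> 40000, every update accounted for
```

Drop the lock and the final total can come out short, and only sometimes -- which is exactly why these bugs are so nasty to reproduce.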
As you can see the most important piece of the puzzle is the developer's skills.
But sometimes you need tens of thousands of computers -- WHAT can you do then?!
We all know that there are hundreds of millions of computers out there used only for Internet browsing and multimedia downloads -- how can we use that to our advantage?! Well, a lot of hackers and companies use (or have used) zombie computers by shipping torrent clients and/or multimedia programs for users to freely download and use. While a lot of computers spend hours a day just downloading, the CPU and a lot of memory are available to be used -- legally or illegally, depending on the EULA provided with the software.
Take Skype for example, it uses your CPU and bandwidth in order to provide you with "free" service:
4.1 Permission to utilize Your computer. In order to receive the benefits provided by the Skype Software, You hereby grant permission for the Skype Software to utilize the processor and bandwidth of Your computer for the limited purpose of facilitating the communication between Skype Software users.
This is a legal way of using your system. Others, however, are using your system just because you got some illegal software from a torrent or warez website, and you can't really complain about that in court, if you know what I mean -- it's your full responsibility.
4.2 Protection of Your computer (resources). You understand that the Skype Software will use its commercially reasonable efforts to protect the privacy and integrity of Your computer resources and Your communication, however, You acknowledge and agree that Skype cannot give any warranties in this respect.
As a Delphi/Pascal developer, what can you use to target as many platforms as possible and implement this? HELLO?!?! Free Pascal and Lazarus are a good starting point, and DO NOT forget that as a developer you should NOT be limited to a single programming language -- you can also use C++ and/or Java if you make your protocol flexible enough!!