Set up a cluster of Arch Linux machines.
In short, you want a Python solution instead of using it directly in Python.
Slurm simply launches jobs. It works like running a script on several machines at once. Regardless of what the tasks are or where they come from, Slurm knows nothing about parallel execution itself: the program or job being run must support parallel processing, and each task needs its own way of communicating with the other machines.
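To make that concrete, here is a minimal sketch of a script that works out its own share of the work when Slurm starts several copies of it. It assumes the copies are launched with `srun`, which sets `SLURM_PROCID` and `SLURM_NTASKS` for each task. The URL list and the helper names `my_share` and `slurm_rank_and_size` are made up for illustration.

```python
import os


def my_share(items, rank, world_size):
    """Return the items this task should handle (round-robin split)."""
    return items[rank::world_size]


def slurm_rank_and_size():
    # SLURM_PROCID and SLURM_NTASKS are set by srun for each task.
    # Fall back to a single-task run when they are absent (e.g. local testing).
    rank = int(os.environ.get("SLURM_PROCID", "0"))
    size = int(os.environ.get("SLURM_NTASKS", "1"))
    return rank, size


if __name__ == "__main__":
    # Placeholder data; a real crawler would load its seed URLs from somewhere.
    urls = [f"https://example.com/page/{i}" for i in range(10)]
    rank, size = slurm_rank_and_size()
    for url in my_share(urls, rank, size):
        print(f"task {rank} of {size} would crawl {url}")
```

Slurm only starts the copies. The split itself happens inside the script, which is the part you have to write.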
You should consider several aspects when evaluating your approach:

1. Can the issue you're addressing be broken into smaller parts? Not all challenges fit well with distributed methods or parallel processing. As mentioned, you could adjust your crawler so each node handles a portion of the search space (for example, x out of total URLs).

2. Is breaking it down efficient enough? For effective distribution, each node should perform mostly independent tasks with little need for coordination. Excessive communication reduces efficiency. Also, ensure nodes aren't restricted by shared resources like bandwidth. If network speed is a bottleneck, performance will suffer. On the other hand, if CPU usage dominates, distribution works better.

3. Which method suits your needs best? You must decide how much interaction is necessary between nodes and find the optimal way to manage it. In some scenarios, shared databases or locks can prevent duplicates. If not, direct node communication or a central coordinator might be more effective.

4. Is language X appropriate? My familiarity with Python is limited, so I can't confidently suggest it based on this. The GIL may hinder threading in Python, making other languages potentially more suitable depending on your goals.

These points don't directly solve your problem, but they should guide your next steps.
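As one possible way to avoid duplicates with no shared database or lock, each node can decide on its own whether a URL belongs to it by hashing the URL. This is only a sketch: the names `owner_of` and `should_crawl` are hypothetical, and it assumes every node knows its own index and the total node count.

```python
import hashlib


def owner_of(url, num_nodes):
    """Map a URL to the index of the node responsible for it."""
    # hashlib gives the same digest on every machine; the built-in hash()
    # is randomized per process for strings, so it would not agree across nodes.
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def should_crawl(url, node_id, num_nodes):
    """True if this node is the one that owns the URL."""
    return owner_of(url, num_nodes) == node_id
```

A node that discovers a link it does not own can drop it or forward it to the owner. Forwarding is where the communication cost from point 2 comes in.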
@Franck mentioned similar ideas about using another programming language such as C#, C++, or Java for this project. He also suggested that JavaScript might be a good fit since it would integrate more smoothly with the web front end I'm developing.