Limit number of jobs in TOSUBMIT state #88
Comments
Original comment by Giovanni Pizzi (Bitbucket: pizzi, GitHub: giovannipizzi): Issue #43 was marked as a duplicate of this issue.
Original comment by Giovanni Pizzi (Bitbucket: pizzi, GitHub: giovannipizzi): Some ideas from #43:
Some ideas:
This is part of a bigger issue #995
This issue seems too generic. It is also not clear to me whether there is a problem if many jobs stay in the TOSUBMIT state.
This could be addressed in terms of the proposed
I think issue #995 is different from this one. This one is about making the daemon aware of queue limits, and then somehow dealing with them.
To elaborate: currently I hack it this way: I limit the number of jobs per scheduler and machine to a hardcoded value (code snippet in execmanager.submit_jobs_with_authinfo):
Also, this solution can only account for per-user queue limits... But it at least allows for 24/7, 100% usage. And if anything fails, you can do a resubmit at the workchain level.
I was just rereading this and realize I do not understand why a scheduler has a job limit. What is the problem with submitting more jobs to the queue? At worst they will just end up in the queue and wait, right? That is the whole point of the scheduler queue. It would make sense if it were possible for the engine, upon learning that one scheduler's queue is full, to submit to another queue, or even another machine, but that would require changing the inputs of the calculation job, which is impossible: one needs to specify which machine and queue are to be used at calculation job launch time. @broeder-j am I missing something?
I encountered a situation that may require limiting the number of jobs in the scheduler. In a scheduler where no priority strategies are set, a newly submitted job has to wait until the previously queued jobs are all done.
@sphuber Is there a good way to deal with this problem? I notice that the number of submitted jobs increases with the number of daemon workers. In aiida_core, is there an upper limit on the number of process tasks a worker can handle?
As of yet there is no built-in solution for this, but we are currently working on it. For now, the best solution is to limit the number you submit yourself to the daemon. In your submission script you can put the submission in a while loop that, before submitting, checks how many active calculations there currently are, and only submits more when needed.
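The while-loop throttle described above could be sketched as follows. The function and parameter names (`wait_for_free_slot`, `count_active_calcjobs`, the limit of 50) are illustrative choices of ours, not part of AiiDA's API; only `QueryBuilder` and `CalcJobNode` come from `aiida.orm`, and the counter assumes a loaded AiiDA profile:

```python
import time

def wait_for_free_slot(count_active, max_active, poll_interval=60.0, sleep=time.sleep):
    """Block until `count_active()` drops below `max_active`.

    `count_active` is any callable returning the number of currently
    active processes; `sleep` is injectable so the loop can be tested.
    """
    while count_active() >= max_active:
        sleep(poll_interval)

def count_active_calcjobs():
    """Count CalcJobNodes that are created, running or waiting.

    Assumes a loaded AiiDA profile; the import is kept local so the
    rest of the sketch works without one.
    """
    from aiida.orm import CalcJobNode, QueryBuilder
    qb = QueryBuilder()
    qb.append(CalcJobNode, filters={
        'attributes.process_state': {'in': ['created', 'running', 'waiting']},
    })
    return qb.count()

# Throttled submission loop (sketch):
# for builder in builders:
#     wait_for_free_slot(count_active_calcjobs, max_active=50)
#     submit(builder)
```

Keeping the counter as an injected callable also makes it easy to swap in a different check, e.g. counting only the processes in a given group.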
Hi, my two cents: controlling the limit through your job calculation/workchain submissions to AiiDA might be hard, since you may not necessarily know how many calculations a workchain produces, and when. Also, I am too lazy to wrap every project in 'flow control code'.
Also, these limits might be quota dependent, i.e. if you run out of quota the admins allow fewer jobs from you in the queue, running and queued.
A comment on this old issue. I think we first need to reduce the memory footprint (see #4603, after re-doing the same tests with the new asyncio daemon), because at the moment, if you submit 1000+ calculations at the same time, they will all allocate a slot, and this will take quite some memory. I think that the logical steps are:
I think 1 and 2 need to be done anyway, to allow for higher throughput of processes if the compute node can cope with it. For 3, we need to think whether it's instead easier to provide some functionality "external" to AiiDA that helps keep a maximum rate of submitted workflows (what I do in practice), because in this way we don't have to touch the complex logic of the engine. I'm thinking of something that checks whether you can submit, and otherwise sleeps for some time before submitting again. The check would be e.g. the current number of running jobs in AiiDA, the current number of running jobs in a given group, or some other simple recipe.
Hi all, I've been using AiiDA recently to develop batch computation programs, and I must say, AiiDA is indeed an incredibly useful framework. It leverages the features of the Python language effectively, making high-throughput calculations remarkably straightforward. I wanted to share my thoughts on this particular issue.
Regarding @sphuber's concern, I'd like to offer my understanding. I believe limiting the number of job submissions on computational nodes is a necessary feature. Imagine needing to compute certain properties for tens of thousands of materials using software like VASP, abinit, aims, etc. I initiate a "massive" computation task using AiiDA services in a Docker container, employing

To prevent the latter scenario, we need a mechanism to throttle job submissions, such as halting AiiDA from submitting remote tasks to a machine once it detects a submission exceeding (2 * the queue capacity). I've noticed that in the recent 2.4.0/2.4.1 releases of

In essence, users should be made aware that AiiDA is just a computational workflow framework, much like Django when establishing a website. Many context-dependent features require users to implement them within this framework. In my use case, I successfully implemented job submission blocking with the following approach, effectively resolving the stack overflow and "PBS under DDOS attack" issues:

```python
# import nest_asyncio  # Opting for this package presents an alternative solution.
from plumpy import ProcessState
...

def calc_all_poscar(self):
    for _ in range(self.inputs.total_struc_num):
        # Barrier: wait until the number of busy calcjobs drops below the limit
        while True:
            qb = QueryBuilder()
            qb.append(
                CalcJobNode,
                filters={
                    'attributes.process_state': {
                        'in': [
                            ProcessState.CREATED.value,
                            ProcessState.RUNNING.value,
                            ProcessState.WAITING.value,
                        ]
                    }
                },
            )
            busy_calcjob_num = len(qb.all())
            if busy_calcjob_num < self.inputs.conf_max_num_nodes.value:
                break
            time.sleep(60)
        # Require and calculate the POSCAR
        poscar = self._get_next_poscar()
        inputs = self.exposed_inputs(VaspAllChain)
        outputs = self.submit(VaspAllChain, poscar=poscar, **inputs)
        self.to_context(respond=append_(outputs))
    ...
```

Where
Thanks for your detailed comment @kYangLi, much appreciated.
First a quick correction: although you are right that until v2.4 there was a bug in AiiDA that could cause a

That being said, I do agree with your assessment that there still is another problem when launching many processes. Both AiiDA and the job schedulers to which they may get sent may get overloaded. This is a problem that has faced many users before, myself included. A common workaround was to control submission with a wrapping script, which others have made into a reusable package (see the

I think your solution is interesting and could be of use, but one should be careful, because due to the

Nevertheless, it is clear that this is a common problem and it would be great if AiiDA provided a built-in solution to limit the submission of processes. I have thought often about this, but unfortunately, due to the current architecture of AiiDA, implementing such a feature is not trivial. When we submit a process, we simply construct its corresponding node in the database and send a task to the process queue of RabbitMQ. RabbitMQ will send any new process tasks straight away to any daemon worker that is connected. There is no way to configure RabbitMQ to "throttle" how many tasks are sent out concurrently, so we cannot implement this throttling on the RabbitMQ level.

The alternative would be to have the throttling on the daemon worker. As soon as it gets a task to continue a process, it could check the current number of active processes and, if that exceeds the number allowed, "refuse" the task and send it back. But RabbitMQ will immediately send it out again. You cannot tell RabbitMQ to "hold" the task for a while. So this approach would just lead to a ping-pong effect, where RabbitMQ keeps sending out the task and the daemon worker keeps sending it back.

The only thing I could think of so far is to have another process queue that is a "parking" queue. When a daemon worker gets a task from the normal queue but realizes that the maximum number of processes is active, it could send the task to that other queue to park it. But then there needs to be a mechanism that moves the task back to the normal queue once some other processes have finished and slots are available again. Suffice it to say that this won't be an easy implementation, and there will be plenty of room for bugs that cause processes to get stuck in the parking queue.

Anyway, maybe there are other ways to integrate a solution in AiiDA. I would be very keen to hear your ideas, because it would be a very valuable feature to have.
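To make the parking-queue idea above concrete, here is a toy, in-memory simulation of the control flow (plain Python, not RabbitMQ or AiiDA code; the function name, the one-cycle task lifetime, and the eager un-parking policy are all simplifying assumptions of ours). It shows how parking avoids the ping-pong effect: a task that finds no free slot waits in a separate queue instead of bouncing between broker and worker.

```python
from collections import deque

def schedule_with_parking(tasks, max_active):
    """Toy sketch of the 'parking queue' idea.

    A worker takes tasks from the work queue; once all `max_active`
    slots are taken, further tasks are parked instead of being bounced
    straight back to the broker. When the active tasks finish (here:
    after one 'cycle'), parked tasks are moved back onto the work
    queue. Returns the list of tasks run in each cycle, to show the
    throttling in action.
    """
    work, parked = deque(tasks), deque()
    cycles = []
    while work or parked:
        active = []
        while work:
            task = work.popleft()
            if len(active) < max_active:
                active.append(task)
            else:
                parked.append(task)  # slot limit hit: park, don't requeue immediately
        cycles.append(active)        # these run and finish this cycle
        work.extend(parked)          # slots are free again: un-park everything
        parked.clear()
    return cycles
```

For example, `schedule_with_parking(['a', 'b', 'c', 'd', 'e'], max_active=2)` runs the five tasks in three cycles of at most two. The hard part glossed over here, as the comment above notes, is exactly the un-parking step: in a real broker, something has to notice that slots have freed up and move the parked tasks back.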
Hi @sphuber, thank you for your prompt response. First of all, I appreciate you pointing out the misconception I had earlier. In my previous experiments, I observed that submitting a large number of jobs inevitably led to the

I understand the systemic difficulties in job control from the

Nevertheless, AiiDA's mature database management, convenient command-line interaction, and robust management system still make it an excellent choice in the high-throughput materials computation field. I once wrote a simple high-throughput computing program in

Previously, following the mindset of "AiiDA is all you need," I manually introduced a blocking mechanism within the Workchain. Honestly, this approach is only barely functional. Through extensive testing, I learned that making a step in the Workchain outline a background resident process may contradict the original design intent of Workchain outlines. For instance, restarting the

Coincidentally, some of my recent ideas align with the implementation approach you provided for the

In other words, the current features of aiida-core make it more suitable as a material calculator (small-scale scheduling) than as a large-scale job scheduler. We feed it one material at a time, and it performs the corresponding calculation based on its internal task tree and machine node configuration, returning the desired results. Large-scale job scheduling tasks can be handled in a module named

Apart from some compromises, this approach has several advantages:

Having said that, I revisited the implementation principles of
Originally reported by: Nicolas Mounet (Bitbucket: mounet, GitHub: nmounet)
The execmanager should limit the number of jobs launched at the same time (to avoid an explosion of the number of jobs in the queue). If too many jobs are already in the TOSUBMIT state, no more jobs should be submitted. The limit has to be defined in some property when one sets up the computer. Then `verdi calculation list` should clearly indicate the jobs not yet submitted, with something like "TOSUBMIT (Waiting)".
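The proposed check could be sketched as a pure function of the current state. Everything here is hypothetical (the function name, the `(job_id, computer)` pair representation, the dict arguments); it only illustrates the selection logic the report asks the execmanager to apply, with a missing limit meaning "no limit":

```python
def select_jobs_to_submit(tosubmit, limits, queued):
    """Pick which TOSUBMIT jobs may be submitted now.

    `tosubmit`: list of (job_id, computer) pairs in the TOSUBMIT state,
    `limits`:   per-computer limit configured when setting up the computer,
    `queued`:   number of jobs each computer already has in its queue.
    Returns (submit_now, waiting); `waiting` jobs stay in TOSUBMIT and
    would show up as "TOSUBMIT (Waiting)".
    """
    counts = dict(queued)
    submit_now, waiting = [], []
    for job_id, computer in tosubmit:
        # A computer without a configured limit accepts everything.
        if counts.get(computer, 0) < limits.get(computer, float('inf')):
            submit_now.append(job_id)
            counts[computer] = counts.get(computer, 0) + 1
        else:
            waiting.append(job_id)
    return submit_now, waiting
```

For example, with a limit of 2 on a computer that already has 1 queued job, only one of three TOSUBMIT jobs would be submitted and the other two would wait.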