
Re: new-scheduler-2.99.19 problem!

> Hello,
> I started to test out the power of new scheduler 2.99.19 on my SunOS
> platform.  But problem comes out once I shoot out a thread to deliver
> to a site that can't accept the connection.  Naturally, the scheduler
> should try again on next schedule time.  However, no smtp transporter
> is created after time up  (I have set idlemax < 1st reschedule time
> and that why the previous born transporter has already passed away).
> From the mailq output, it shows weird thing that transporter with pid 0 
> is running but actually 'ps' shows no process is created by scheduler.

	Weird indeed.
	(Btw, 2.99.19 also has "scheduler", which works in the old
	 style, i.e. sometimes better than "scheduler-new", but it
	 usually causes more system load.)

	It is a symptom of a transport process dying without
	removing itself from the active chains.

	I can re-create the phenomenon too..
	(By sending to  smtp/*-gw.funet.fi, which have a 1 min timeout
	 and always refuse the connection...  kiwi-gw, passion-gw, ..)

	So far I have found one place where a catastrophic reclaim()
	happens, and why it happens.. Sometimes the children die, and
	when that happens, the pid is negated to flag a dead transporter.
	Still the fd can exist!  This was reported at the place where I
	fprintf(stderr) a message about the process not being in the
	idle chain.  Now that one is handled ok.

	I have yet to find why pid=0 can happen, because the thread
	restart happens only when thread->proc is NULL.. i.e. when
	there is no active process record.
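
	The bookkeeping described above can be sketched roughly as
	follows.  This is only an illustration under assumptions: the
	struct layouts and the names procinfo, active_chain, chain_add,
	chain_remove, mark_dead, reclaim_proc and thread_can_restart are
	hypothetical and are not taken from the scheduler source.  Only
	the pid negation and the "restart when thread->proc is NULL"
	rule come from the description.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical, simplified records; the real scheduler structures differ. */
struct procinfo {
    pid_t pid;              /* > 0: running, < 0: child died (pid negated) */
    int fd;                 /* pipe to the transporter, -1 once released */
    struct procinfo *next;  /* link in the active chain */
};

struct thread {
    struct procinfo *proc;  /* NULL when no process record is attached */
};

static struct procinfo *active_chain = NULL;

/* Put a process record at the head of the active chain. */
static void chain_add(struct procinfo *p)
{
    p->next = active_chain;
    active_chain = p;
}

/* Unlink a process record from the active chain, if it is there. */
static void chain_remove(struct procinfo *p)
{
    struct procinfo **pp;

    for (pp = &active_chain; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == p) {
            *pp = p->next;
            p->next = NULL;
            return;
        }
    }
}

/* Child death noticed: flag the record by negating the pid.
 * The fd is left alone, so it can still exist at this point. */
static void mark_dead(struct procinfo *p)
{
    if (p->pid > 0)
        p->pid = -p->pid;
}

/* Full cleanup: leave the chain, release the fd slot, detach from the thread.
 * (Real code would close(2) the descriptor before forgetting it.) */
static void reclaim_proc(struct thread *t)
{
    struct procinfo *p = t->proc;

    if (p == NULL)
        return;
    chain_remove(p);
    p->fd = -1;
    t->proc = NULL;
}

/* A thread is restarted only when it has no process record. */
static int thread_can_restart(const struct thread *t)
{
    return t->proc == NULL;
}
```

	In this model a record that is marked dead but never reclaimed
	keeps thread->proc non-NULL, so the thread is never restarted,
	which resembles the stuck state seen in the mailq output.  The
	sketch does not explain how a pid of exactly 0 could arise.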

	Maybe I will get it later today;  pick_next_vertex() is my
	current suspect..

	/Matti Aarnio <mea@nic.funet.fi>