[Raw Msg Headers][Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

trace of scheduler core dumps



The scheduler core dumping problem I reported earlier might have
something to do with how the scheduler handles a high number of
mail/threads/child processes (?) (sorry, I haven't caught up with how
the scheduler works now).  I cut down the number of messages from 179
to about 19 and the scheduler has not died on me yet.

I'm including a trace of the scheduler process.  Can someone provide me
with some insight as to what might be happening?  I'm running 2.99.38
on a Sun Ultra under Solaris 2.5.1.  There are 179 messages in the
transport directory, all of them are supposed to smtp to
mac.its.hawaii.edu which should be down so I can watch the scheduler
do its work.    I am using the default scheduler.conf file.

If someone can confirm it's a bug and not something of my doing, then
I can stop worrying about it.  Please reply to julio@hawaii.edu as I'm
not in this mailing list.  Thanks.

-Julio Polo, University of Hawaii

------------------------------------------------------------------------------

Here is a trace of the scheduler when it starts up and tries to process
the 179 messages in queue/transport left by the router.


ROOT@util3:/usr/uh/spkg/zmailer29938/bin# ./scheduler -iv
ZMailer scheduler (2.99.38 #1: Tue Oct 22 15:54:30 HST 1996)
  root@util3:/usr/uh/spkg/zmailer-2.99.38-src/scheduler
Copyright 1992 Rayan S. Zachariassen
Copyright 1992-1996 Matti Aarnio

(loading scheduler.conf stuff deleted)

dirqueuescan(dir='.') wrksum=178 new=178
vtxprep: msg 307699-29159 rcptns total 1 work 0 failed 0 done 1
Matched 11th config entry with: smtp/mac.its.hawaii.edu
thread_linkin([smtp/mac.its.hawaii.edu],smtp/0/*.edu,11)
create_thread(smtp/0/mac.its.hawaii.edu) -> 0x4f018
thread_linkin() to thg=0x4ef78[smtp/0/*.edu]; created a new thread 0x4f018 [smtp/mac.its.hawaii.edu]
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-846212329, thr=0x4f018)

transport(vhead,chan=smtp,host=mac.its.hawaii.edu)
$ /usr/uh/spkg/zmailer29938/bin/ta/smtp -s
to 5/7 from 7/5
stashprocess(29272, 7, 7, smtp, mac.its.hawaii.edu)
dirqueuescan(dir='.') wrksum=177 new=0
idle_cleanup()
vtxprep: msg 307705-29158 rcptns total 1 work 0 failed 0 done 1
Matched 11th config entry with: smtp/mac.its.hawaii.edu
thread_linkin([smtp/mac.its.hawaii.edu],smtp/0/*.edu,11)
thread_linkin() to thg=0x4ef78[smtp/0/*.edu]; added into existing thread [smtp/mac.its.hawaii.edu] thr->jobs=2
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-846212329, thr=0x4f018)
 -- already running
vtxprep: msg 307706-29160 rcptns total 1 work 0 failed 0 done 1
Matched 11th config entry with: smtp/mac.its.hawaii.edu
thread_linkin([smtp/mac.its.hawaii.edu],smtp/0/*.edu,11)
thread_linkin() to thg=0x4ef78[smtp/0/*.edu]; added into existing thread [smtp/mac.its.hawaii.edu] thr->jobs=3
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-846212329, thr=0x4f018)

(thr->jobs=4,5,6 etc. etc. deleted)

vtxprep: msg 307718-29159 rcptns total 34 work 0 failed 0 done 34
Matched 11th config entry with: smtp/mac.its.hawaii.edu
thread_linkin([smtp/mac.its.hawaii.edu],smtp/0/*.edu,11)
thread_linkin() to thg=0x4ef78[smtp/0/*.edu]; added into existing thread [smtp/mac.its.hawaii.edu] thr->jobs=22
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-846212329, thr=0x4f018)
 -- already running
curitem 4f018 curitem->wakeup 0 now 846212329
read from 7 returns 8
29272 fd=7 processed: #hungry
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x4f110, jobs=22 ok=1 justfree=0)
 ... NONE, current one has not been fed..
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x4f110, len=32 buf=307699-29159	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x4f110, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x159b20
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x159b20, len=32 buf=307705-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x159b20, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x4f7e8
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x4f7e8, len=32 buf=307706-29160	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x4f7e8, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x44f00
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x44f00, len=32 buf=307710-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x44f00, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x159f60
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x159f60, len=32 buf=307711-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x159f60, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x44d48
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x44d48, len=32 buf=307712-29160	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x44d48, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x159d80
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x159d80, len=32 buf=307713-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x159d80, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a348
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a348, len=32 buf=307714-29160	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a348, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x159ff0
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x159ff0, len=32 buf=307701-29161	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x159ff0, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a100
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a100, len=32 buf=307709-29159	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a100, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a1f0
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a1f0, len=32 buf=307716-29160	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a1f0, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15ad50
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15ad50, len=32 buf=307715-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15ad50, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15afd8
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15afd8, len=32 buf=307717-29161	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15afd8, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a740
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a740, len=32 buf=307720-29160	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a740, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15aec0
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15aec0, len=32 buf=307721-29160	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15aec0, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15aba8
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15aba8, len=32 buf=307722-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15aba8, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a9b0
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a9b0, len=32 buf=307724-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a9b0, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a3d8
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a3d8, len=32 buf=307719-29161	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a3d8, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a800
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a800, len=32 buf=307725-29158	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a800, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15b6b8
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15b6b8, len=32 buf=307723-29160	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15b6b8, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15a508
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15a508, len=32 buf=307726-29161	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15a508, jobs=22 ok=1 justfree=0)
 ... thr=same vtx=0x15b1a8
feed: tofd=7, fed=0, chan=smtp, proc=0x517d0, vtx=0x15b1a8, len=32 buf=307718-29159	mac.its.hawaii.edu
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15b1a8, jobs=22 ok=1 justfree=0)
 ... overfed=22, no thread change.
read from 7 returns 8, errno=0
vtxprep: msg 307728-29160 rcptns total 1 work 0 failed 0 done 1
Matched 11th config entry with: smtp/mac.its.hawaii.edu
thread_linkin([smtp/mac.its.hawaii.edu],smtp/0/*.edu,11)
thread_linkin() to thg=0x4ef78[smtp/0/*.edu]; added into existing thread [smtp/mac.its.hawaii.edu] thr->jobs=23
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-846212329, thr=0x4f018)
 -- already running
.
(etc. etc. etc.)
.
curitem 4f018 curitem->wakeup 0 now 846212330
vtxprep: msg 307883-29161 rcptns total 25 work 0 failed 0 done 25
Matched 11th config entry with: smtp/mac.its.hawaii.edu
thread_linkin([smtp/mac.its.hawaii.edu],smtp/0/*.edu,11)
thread_linkin() to thg=0x4ef78[smtp/0/*.edu]; added into existing thread [smtp/mac.its.hawaii.edu] thr->jobs=176
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-846212330, thr=0x4f018)
 -- already running
curitem 4f018 curitem->wakeup 0 now 846212330
vtxprep: msg 307753-29160 rcptns total 1 work 0 failed 0 done 1
Matched 2th config entry with: hold/ns:henson.cc.www.edu/any
thread_linkin([hold/ns:henson.cc.www.edu/any],hold/0/*,2)
create_thread(hold/0/ns:henson.cc.www.edu/any) -> 0x168cb0
thread_linkin() to thg=0x168c10[hold/0/*]; created a new thread group, and thread [hold/ns:henson.cc.www.edu/any]
thread_start(thr=hold/0/ns:henson.cc.www.edu/any) (dt=-846212330, thr=0x168cb0)

transport(vhead,chan=hold,host=ns:henson.cc.www.edu/any)
$ /usr/uh/spkg/zmailer29938/bin/ta/hold
to 5/8 from 8/5
stashprocess(29273, 8, 8, hold, ns:henson.cc.www.edu/any)
curitem 4f018 curitem->wakeup 0 now 846212330
vtxprep: msg 307882-29158 rcptns total 25 work 0 failed 0 done 25
Matched 11th config entry with: smtp/mac.its.hawaii.edu
thread_linkin([smtp/mac.its.hawaii.edu],smtp/0/*.edu,11)
thread_linkin() to thg=0x4ef78[smtp/0/*.edu]; added into existing thread [smtp/mac.its.hawaii.edu] thr->jobs=177
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-846212330, thr=0x4f018)
 -- already running
curitem 4f018 curitem->wakeup 0 now 846212330
read from 8 returns 8
29273 fd=8 processed: #hungry
pick_next_vertex(proc->tofd=8, thr=0x168cb0, vtx=0x168da8, jobs=1 ok=1 justfree=0)
 ... NONE, current one has not been fed..
feed: tofd=8, fed=0, chan=hold, proc=0x51bf8, vtx=0x168da8, len=38 buf=307753-29160	ns:henson.cc.www.edu/any
pick_next_vertex(proc->tofd=8, thr=0x168cb0, vtx=0x168da8, jobs=1 ok=1 justfree=0)
 ... overfed=1, no thread change.
read from 8 returns 8, errno=0

sig_chld() pid=29295, ok=1, stat=0xff00
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212389
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 0 now 846212399
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212409
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 0 now 846212419
read from 8 returns 137
29273 fd=8 processed: 307753/170	try againdelayed4.4.3 (Temporary routing lookup failure)466 (Temporary routing lookup failure)	deferred try again
diagnostic: 307753/170	try againdelayed4.4.3 (Temporary routing lookup failure)466 (Temporary routing lookup failure)	deferred
reschedule 168da8 now 846212426 expiry in 61745 attempts 1 factor -1 inum 307753 (hold/ns:henson.cc.www.edu/any: 307753-29160)
wakeup 846212700 pending 0
ap 0 pap 0 curitem 0
29273 fd=8 processed: #hungry
pick_next_vertex(proc->tofd=8, thr=0x168cb0, vtx=0x168da8, jobs=1 ok=1 justfree=0)
 ... NONE, idle the process (of=0, f=1).
read from 8 returns 137, errno=0
read from 8 returns 8
29273 fd=8 processed: #hungry
pick_next_vertex(proc->tofd=8, thr=0x0, vtx=0x0, jobs=0 ok=1 justfree=0)
 ... NONE, we are idle.
read from 8 returns 8, errno=0
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212429
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 0 now 846212439
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212449
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 0 now 846212459
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212469
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 0 now 846212479
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212489
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 0 now 846212499
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212509
sig_chld() pid=29303, ok=1, stat=0xff00
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 0 now 846212519
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 0 now 846212529
read from 7 returns 277
29272 fd=7 processed: 307699/144	rabsf-l@Hawaii.Edufailed5.4.1 (TCP/IP-connection failure)smtp; 500 (connect to mac.its.hawaii.edu [128.171.201.40]: Connection timed out)dns; mac.its.hawaii.edu	retryat +60 smtp; 500 (connect to mac.its.hawaii.edu [128
.171.201.40]: Connection timed out)
diagnostic: 307699/144	rabsf-l@Hawaii.Edufailed5.4.1 (TCP/IP-connection failure)smtp; 500 (connect to mac.its.hawaii.edu [128.171.201.40]: Connection timed out)dns; mac.its.hawaii.edu	retryat
thread_reschedule() ch=smtp ho=mac.its.hawaii.edu jobs=177
29272 fd=7 processed: #hungry
... child pid 29272 overfed=21
read from 7 returns 277, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=20
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=19
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=18
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=17
read from 7 returns 8, errno=0
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 846212590 now 846212539
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=16
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=15
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=14
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=13
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=12
read from 7 returns 8, errno=0
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
idle_cleanup() killing TA on tofd=8 pid=29273
curitem 4f018 curitem->wakeup 846212590 now 846212549
sig_chld() pid=29273, ok=0, stat=0x0
read from 8 returns 0, errno=0
reclaim(8,-1) pid=29273, reaped=0, chan=hold, host=ns:henson.cc.www.edu/any
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=11
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=10
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=9
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=8
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=7
read from 7 returns 8, errno=0
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 846212590 now 846212559
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=6
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=5
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=4
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=3
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=2
read from 7 returns 8, errno=0
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 846212590 now 846212569
read from 7 returns 8
29272 fd=7 processed: #hungry
... child pid 29272 overfed=1
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
pick_next_vertex(proc->tofd=7, thr=0x4f018, vtx=0x15b1a8, jobs=177 ok=1 justfree=0)
 ... NONE, idle the process (of=0, f=1).
read from 7 returns 8, errno=0
read from 7 returns 8
29272 fd=7 processed: #hungry
pick_next_vertex(proc->tofd=7, thr=0x0, vtx=0x0, jobs=0 ok=1 justfree=0)
 ... NONE, we are idle.
read from 7 returns 8, errno=0
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 846212590 now 846212579
dirqueuescan(dir='.') wrksum=0 new=0
idle_cleanup()
curitem 4f018 curitem->wakeup 846212590 now 846212589
dirqueuescan(dir='.') wrksum=0 new=0
curitem 4f018 curitem->wakeup 846212590 now 846212599
thread_start(thr=smtp/0/mac.its.hawaii.edu) (dt=-9, thr=0x4f018)

Bus Error (core dumped)