
Re: BerkeleyDB via NFS



On Thu, Apr 01, 2004 at 02:10:17PM -0300, Carlos G Mendioroz wrote:
> I GUESS that the problem would occur when changing the page file (i.e. 
> reordering inside buckets), which needs a fairly large database to be an 
> issue, and insertion activity. (I don't know if BDB coalesces buckets on 
> deletion, and I may be saying something wrong :-)
> 
> As they say, my 0.02


  Whatever the exact reason is, the btree lookup can fail in several
  ways in this case (it is not 'Concurrent Data Store'; CDB):

     - Misread indices without noticing it
     - Misread indices, notice it, and return an error

  In the latter case the code could be amended to handle it in the
  router (close and reopen the DB), but it doesn't do that now.
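
  A minimal sketch of that amendment, assuming a hypothetical db
  driver interface; struct dbstate, db_lookup() and db_reopen() are
  stand-ins, not real ZMailer functions, and a 'stale' flag simulates
  a handle whose underlying file changed under it:

```c
#include <string.h>

/* Hypothetical stand-ins for the router's db driver.  The 'stale'
 * flag simulates a handle left pointing at a replaced file. */
struct dbstate { int stale; const char *value; };

static int db_lookup(struct dbstate *db, const char *key,
                     char *out, int outlen)
{
    (void)key;
    if (db->stale)
        return -1;                     /* misread detected */
    strncpy(out, db->value, outlen - 1);
    out[outlen - 1] = '\0';
    return 0;
}

static int db_reopen(struct dbstate *db)
{
    db->stale = 0;                     /* fresh handle sees the new file */
    return 0;
}

/* The amendment itself: on a detected misread, close and reopen
 * the db handle, then retry the lookup exactly once. */
int lookup_with_retry(struct dbstate *db, const char *key,
                      char *out, int outlen)
{
    if (db_lookup(db, key, out, outlen) == 0)
        return 0;                      /* first try succeeded */
    if (db_reopen(db) != 0)
        return -1;                     /* reopen failed; give up */
    return db_lookup(db, key, out, outlen);
}
```

  A single retry suffices here because the reopened handle reads the
  replaced file from scratch; looping forever would mask real errors.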

  What could work, though, follows from these observations:

    - Database creation from randomly ordered data is slow
    - Btree creation in strictly ascending key order is orders
      of magnitude faster than from random order
    - "dblookup -dump btree path-to.db" dumps data in that
      strictly ascending order; however, it has no support
      for CDB mode, so it cannot safely dump a CDB-edited
      database in a guaranteed-consistent manner.
    - Writing a wee bit of C or Perl code to open the source
      db in CDB mode and a temporary destination in simple
      mode is close to trivial.  So is iterating through the
      source db and storing everything into the destination db.
    - A final "mv TEMPDEST.db DESTINATION.db" completes the
      chain of tasks, letting internal systems auto-detect
      the change in the underlying database.
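
  The last two steps can be sketched roughly as below.  The Berkeley
  DB cursor loop is only indicated in a comment (a real version would
  open the source in CDB mode and walk it with DB_NEXT, which returns
  btree keys in ascending order); the concrete part is the atomic
  rename that replaces the production file:

```c
#include <stdio.h>

/* Rebuild the destination db by writing a fresh copy to a temporary
 * file and renaming it into place.  The record-writing here is a
 * placeholder; real code would fill the temporary db by iterating
 * the source db with a Berkeley DB cursor in ascending key order. */
int rebuild_db(const char *dest)
{
    char tmp[1024];
    snprintf(tmp, sizeof(tmp), "%s.tmp", dest);

    FILE *fp = fopen(tmp, "w");
    if (!fp)
        return -1;
    /* ... cursor loop storing records in ascending key order ... */
    fprintf(fp, "placeholder record\n");
    fclose(fp);

    /* rename() is atomic on POSIX: readers see either the old file
     * or the new one, never a half-written copy.  The destination's
     * i-node number changes, which is exactly what mtime/i-node
     * change detection keys on. */
    if (rename(tmp, dest) != 0)
        return -1;
    return 0;
}
```

  Doing the copy to a temporary name first is the whole trick; writing
  into the live file directly would expose readers to a partial db.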

  Oh yes, the boilerplate  $MAILVAR/db/dbases.conf  lists
  most router-produced databases with the 'm' = "checkmtime"
  flag.  That means each lookup is paranoid about database
  access: if there is any change (e.g. the mtime is different,
  or the i-node number of the underlying file has changed,
  or ...), the database is closed and re-opened.

  To a large extent this may mean that the db driver is
  usually lucky and detects your ongoing changes.

  Copying a million-entry btree into a fresh file will
  produce a maximum-density database, and be quite fast.
  Say it might take a minute.  If you copy the actual
  db into the production dataset once every 10-15 minutes,
  what would your customers think about it?

  I.e., can provisioning lag behind by those 10-15 minutes?

 
> Mariano Absatz wrote:
> 
> > OK... maybe I'm too stubborn, but anyway, we did a small test running one 
> > daemon randomly inserting, modifying and deleting records in a BerkeleyDB 
> > 4.2 hash database, while about 90 other processes were randomly reading 
> > records...
> > 
> > We got 6 search errors in about 1,000,000 reads... I could live with 
> > that... 
> > 
> > Does anyone have personal (or 3rd. party but actually known) case 
> > experience doing this and not working? or is it just a feelin'?
> > 
> > TIA
> > 
> > El 19 Mar 2004 a las 3:41, Matti Aarnio escribió:
> > 
> > 
> >>On Thu, Mar 18, 2004 at 05:48:36PM -0300, Mariano Absatz wrote:
> >>
> >>>Hi,
> >>>
> >>>I think I asked this before here and the only answer came from Eugene, 
> >>>telling me to use cfengine.
> >>>
> >>>However, I have a new case where cfengine will not be as effective...
> >>>
> >>>Has anyone experience using smtp-policy.db, userdb.db, routes.db and the 
> >>>like in a fast NFS mount?
> >>>
> >>>The databases would NEVER be modified from the mail servers, but from a 
> >>>central server (that has them also NFS mounted), and most of them will 
> >>>actually be dynamically modified (not via 'newdb').
> >>
> >>  Db-file recompile method (e.g. what  "zmailer newdb" does)  should be 
> >>  successfully autosensed over NFS, and handled in consistent manner.
> >>
> >>  What will NOT be successful is concurrent-data-store access in
> >>  Sleepycat DB.  Shared-memory segments don't fly over NFS very
> >>  nicely...
> >>
> >>  What _might_ work is GNU GDBM shared writer access with its locking
> >>  schemes.  Things I faintly recall about it might allow successful
> >>  shared access over NFS.  (fcntl locking, no shared memory things.)
> >>  Reading gdbm's documents:
> >>    Readers and writers can not open the `gdbm' database at the same time.
> >>  Oh, uh..   No can do.
> >>
> >>  What COULD work (even handsomely) is RPC mode of Sleepycat DB.
> >>  However it will require additional code in the router system ..
> >>  .. and isn't supportable in smtpserver without considerable
> >>  modifying of "PARAM policydb" processing.    Hmm..  introducing
> >>  new dbtype, perhaps "sleepyrpc",  and 'file' parameter would then
> >>  point to configuration file with necessary parameters ?
> >>  The   lib/sleepycatdb.c  function collection isn't quite generic
> >>  enough to be used at present anywhere, but inside the router.
> >>
> >>  I was entertaining an idea of replicating sleepycat db thru
> >>  its builtin replication support ("some" user code required!)
> >>  but that is rather over-complicated thing..
> > 
> > 
> > --
> > Mariano Absatz
> > El Baby
> -- 
> Carlos G Mendioroz  <tron@huapi.ba.ar>  LW7 EQI  Argentina

-- 
/Matti Aarnio	<mea@nic.funet.fi>
-
To unsubscribe from this list: send the line "unsubscribe zmailer" in
the body of a message to majordomo@nic.funet.fi