Discussion:
[Abook-devel] [Fwd: [patch] import of ldif files]
Christian Brabandt
2012-12-13 07:56:24 UTC
Permalink
Hi this is a small patch to optimise the ldif importer.
It doesn't really make sense, to run through the complete list of ldif
fields, once we have added one item successfully.
Please check:


git diff filter.c
diff --git a/filter.c b/filter.c
index 13a0234..ad58d5f 100644
--- a/filter.c
+++ b/filter.c
@@ -640,6 +640,7 @@ ldif_convert(ldif_item item, char *type, char *value)
free(item_fget(item, i));

item_fput(item, i, xstrdup(value));
+ break;
}
}
}



regards,
Christian
--
weapon, n.:
An index of the lack of development of a culture.
Raphaël
2012-12-13 17:58:09 UTC
Permalink
Post by Christian Brabandt
Hi this is a small patch to optimise the ldif importer.
It doesn't really make sense, to run through the complete list of ldif
fields, once we have added one item successfully.
pushed. thanks!

You may want to also test the 4 following commits, respectively:
* cleanup and fix a segfault in LDIF import
* allow the LDIF filter to parse from stdin
* keep the LDIF export from dumping NULL email field
* don't reject LDIF records not having a "xmozillanickname"


regards
Christian Brabandt
2012-12-13 19:27:33 UTC
Permalink
Hi Raphaël!
Post by Raphaël
* don't reject LDIF records not having a "xmozillanickname"
<snip>
--- a/filter.c
+++ b/filter.c
@@ -609,10 +609,6 @@ ldif_add_item(ldif_item li)

item = item_create();

- if(!li[LDIF_ITEM_FIELDS -1])
- goto bail_out;
-
-

I am not sure this is actually correct. I think, this will always be
field objetclass which I think, always needs to be present.

So this is actualy useful, to discard invalid ldif files.

regards,
Christian
--
Lieber unheimlich drauf, als total unten durch.
Raphaël
2012-12-14 00:37:43 UTC
Permalink
Post by Christian Brabandt
Hi Raphaël!
<snip>
--- a/filter.c
+++ b/filter.c
@@ -609,10 +609,6 @@ ldif_add_item(ldif_item li)
item = item_create();
- if(!li[LDIF_ITEM_FIELDS -1])
- goto bail_out;
-
-
I am not sure this is actually correct. I think, this will always be
field objetclass which I think, always needs to be present.
So this is actualy useful, to discard invalid ldif files.
You're partly right. After a deeper look there's definitely a
bug: this test *could* have been useful but *actually* wasn't.

The `li` variable contains the item built from what has just been
parsed previously, before it gets copied into a real item stored into
the database.

But no use is done from the objectclass fields, not even in ldif_add_item().
As you can see, while xmozillanickname is mapped to NICK, objectclass is
*not* mapped to anything.
Post by Christian Brabandt
if(field == "objectclass" && value == "person") just-ignore (break);
But the fact is that if the value were "top" (as exported by
abook's ldif), nothing more would happen as this field is not mapped.

That's also why I reduced the for-loop to LDIF_ITEM_FIELDS - 1 in
cd6241eed. Agreed it's a bit rough.


Where really is the issue ?
Post by Christian Brabandt
objectclass: top
* item_fput(item, 15, xstrdup(value));
* where field_id(i) = 16 // defining the index used by item_fput()
* field_id() assertion should have failed, but it tests ITEM_FIELDS
instead of LDIF_ITEM_FIELDS
Then, specifically before cd6241eed:
* if !item[LDIF_ITEM_FIELDS - 1]: bail_out, the test is done on item[15]
which *is* "xmozillanickname".

That's why in the practice, LDIF entries without xmozillanickname were
rejected and the commit intended to fix that bug.
But I agree that testing objectclass presence is necessary to avoid
buggy LDIF records.
Post by Christian Brabandt
extern abook_field standard_fields[];
for(i=0; i < LDIF_ITEM_FIELDS; i++)
printf("i=%d, field_id(i)=%d, standard_fields[i]=%s, ldif_field_names[i]=%s\n",
i, field_id(i), standard_fields[i].key, ldif_field_names[i]);
i=0, field_id(i)=0, standard_fields[i]=name, ldif_field_names[i]=cn
i=1, field_id(i)=1, standard_fields[i]=email, ldif_field_names[i]=mail
i=2, field_id(i)=3, standard_fields[i]=address, ldif_field_names[i]=streetaddress
i=3, field_id(i)=4, standard_fields[i]=address2, ldif_field_names[i]=streetaddress2
i=4, field_id(i)=5, standard_fields[i]=city, ldif_field_names[i]=locality
i=5, field_id(i)=6, standard_fields[i]=state, ldif_field_names[i]=st
i=6, field_id(i)=7, standard_fields[i]=zip, ldif_field_names[i]=postalcode
i=7, field_id(i)=8, standard_fields[i]=country, ldif_field_names[i]=countryname
i=8, field_id(i)=9, standard_fields[i]=phone, ldif_field_names[i]=homephone
// here come troubles
i=9, field_id(i)=10, standard_fields[i]=workphone, ldif_field_names[i]=description
i=10, field_id(i)=11, standard_fields[i]=fax, ldif_field_names[i]=homeurl
i=11, field_id(i)=12, standard_fields[i]=mobile, ldif_field_names[i]=facsimiletelephonenumber
i=12, field_id(i)=13, standard_fields[i]=nick, ldif_field_names[i]=cellphone
i=13, field_id(i)=14, standard_fields[i]=url, ldif_field_names[i]=xmozillaanyphone
i=14, field_id(i)=15, standard_fields[i]=notes, ldif_field_names[i]=xmozillanickname
i=15, field_id(i)=16, standard_fields[i]=anniversary, ldif_field_names[i]=objectclass

For LDIF fields higher than 8, field_id() isn't suited and brings
inconsistencies.

I'll try to provide a fix soon.
Probably increasing LDIF_ITEM_FIELDS in order to safely store
objectclass so we can test invalid LDIF records.
Although adapting, copying or wrapping item_fput() to work consistently
with LDIF will need more thoughts.
In the best case reordering `ldif_field_names` and `ldif_conv_table` in
respect to `field_types` could be enough to do the trick.


anyway thank you to bring this double-check.
Raphaël
2012-12-19 12:46:39 UTC
Permalink
I pushed 3 new commits about ldif.
By date desc:
* don't force output to latin1 anymore
* ldif export all available emails
* ldif import reworked (2f827e0e)

2f827e0e is a big commit which intends to fix all the issues related to
fields names mappings, while paving the way for a more adaptive ldif importer.
if( !li[LDIF_OBJECTCLASS] ) goto bail_out;
Custom fields are not yet imported (nor exported) but I hope the new
code-base may support that (more) easily.


enjoy
Hi Raphaël!
Post by Raphaël
* don't reject LDIF records not having a "xmozillanickname"
<snip>
--- a/filter.c
+++ b/filter.c
@@ -609,10 +609,6 @@ ldif_add_item(ldif_item li)
item = item_create();
- if(!li[LDIF_ITEM_FIELDS -1])
- goto bail_out;
-
-
I am not sure this is actually correct. I think, this will always be
field objetclass which I think, always needs to be present.
So this is actualy useful, to discard invalid ldif files.
regards,
Christian
Loading...