[3/3] buildhistory: remove duplicate entries from dot graph

Submitted by Koen Kooi on March 20, 2012, 11 a.m.

Details

Message ID 1332241255-5083-3-git-send-email-koen@dominion.thruhere.net
State New
Headers show

Commit Message

Koen Kooi March 20, 2012, 11 a.m.
There are various conditions that lead to duplicate entries in the dot graph which need to get fixed, but this patch is a catchall.

Another benefit is that the sort order is now know, leading to less spurious diffs in buildhistory commits.

Signed-off-by: Koen Kooi <koen@dominion.thruhere.net>
---
 meta/classes/buildhistory.bbclass |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

Patch hide | download patch | download mbox

diff --git a/meta/classes/buildhistory.bbclass b/meta/classes/buildhistory.bbclass
index 3aec325..6502b04 100644
--- a/meta/classes/buildhistory.bbclass
+++ b/meta/classes/buildhistory.bbclass
@@ -291,6 +291,7 @@  buildhistory_get_image_installed() {
 	# Produce installed package file and size lists and dependency graph
 	echo -n > ${BUILDHISTORY_DIR_IMAGE}/installed-packages.txt
 	echo -n > ${BUILDHISTORY_DIR_IMAGE}/installed-package-sizes.tmp
+	echo -n > ${BUILDHISTORY_DIR_IMAGE}/depends.tmp	
 	echo -e "digraph depends {\n    node [shape=plaintext]" > ${BUILDHISTORY_DIR_IMAGE}/depends.dot
 	for pkg in $INSTALLED_PKGS; do
 		pkgfile=`get_package_filename $pkg`
@@ -302,14 +303,16 @@  buildhistory_get_image_installed() {
 
 		deps=`list_package_depends $pkg | sort | uniq`
 		for dep in $deps ; do
-			echo "$pkg OPP $dep;" | sed -e 's:-:_:g' -e 's:\.:_:g' -e 's:+::g' | sed 's:OPP:->:g' >> ${BUILDHISTORY_DIR_IMAGE}/depends.dot
+			echo "$pkg OPP $dep;" | sed -e 's:-:_:g' -e 's:\.:_:g' -e 's:+::g' | sed 's:OPP:->:g' >> ${BUILDHISTORY_DIR_IMAGE}/depends.tmp
 		done
 
 		recs=`list_package_recommends $pkg | sort | uniq`
 		for rec in $recs ; do
-			echo "$pkg OPP $rec [style=dotted];" | sed -e 's:-:_:g' -e 's:\.:_:g' -e 's:+::g' | sed 's:OPP:->:g' >> ${BUILDHISTORY_DIR_IMAGE}/depends.dot
+			echo "$pkg OPP $rec [style=dotted];" | sed -e 's:-:_:g' -e 's:\.:_:g' -e 's:+::g' | sed 's:OPP:->:g' >> ${BUILDHISTORY_DIR_IMAGE}/depends.tmp
 		done
 	done
+	cat ${BUILDHISTORY_DIR_IMAGE}/depends.tmp | sort | uniq >> ${BUILDHISTORY_DIR_IMAGE}/depends.dot
+	rm ${BUILDHISTORY_DIR_IMAGE}/depends.tmp
 	echo "}" >>  ${BUILDHISTORY_DIR_IMAGE}/depends.dot
 
 	cat ${BUILDHISTORY_DIR_IMAGE}/installed-package-sizes.tmp | sort -n -r | awk '{print $1 "\tKiB " $2}' > ${BUILDHISTORY_DIR_IMAGE}/installed-package-sizes.txt

Comments

Paul Eggleton March 20, 2012, 11:48 a.m.
On Tuesday 20 March 2012 12:00:55 Koen Kooi wrote:
> There are various conditions that lead to duplicate entries in the dot graph
> which need to get fixed, but this patch is a catchall.

Dependencies and recommends are already de-duped - are we getting duplicates 
in the package list as well?

> Another benefit is that the sort order is now know, leading to less spurious
> diffs in buildhistory commits.

The package list is already sorted, and so are the dependencies and 
recommends, so how would the sort order be changing?

Cheers,
Paul
Koen Kooi March 20, 2012, 12:11 p.m.
Op 20 mrt. 2012, om 12:48 heeft Paul Eggleton het volgende geschreven:

> On Tuesday 20 March 2012 12:00:55 Koen Kooi wrote:
>> There are various conditions that lead to duplicate entries in the dot graph
>> which need to get fixed, but this patch is a catchall.
> 
> Dependencies and recommends are already de-duped - are we getting duplicates 
> in the package list as well?

From what I'm seeing, only dupes in the dot files:

https://github.com/Angstrom-distribution/buildhistory/commit/b2a256e02e04349db51e663e35b2bb9989acc854#diff-1

>> Another benefit is that the sort order is now know, leading to less spurious
>> diffs in buildhistory commits.
> 
> The package list is already sorted, and so are the dependencies and 
> recommends, so how would the sort order be changing?

Those fields are not sorted in the package itself[1], re-running bitbake systemd-image after these changes:

 angstrom_feed_configs -> opkg [style=dotted];
-angstrom_task_boot -> task_boot;
 angstrom_task_boot -> angstrom_version;
+angstrom_task_boot -> task_boot;
[..]
-htop -> ncurses_terminfo;
-htop -> libtinfo5;
 htop -> libncurses5;
-iw -> libnl_genl_3_200;
+htop -> libtinfo5;
+htop -> ncurses_terminfo;
 iw -> libnl_3_200;

I think I know why [2] doesn't work:

koen@dominion:/data/ssd/OE/buildhistory$ echo b c a | sort
b c a
koen@dominion:/data/ssd/OE/buildhistory$ echo -e b\\nc\\na | sort
a
b
c

Let me know how you want me to respin this series, each commit is an RFC on its own.

regards,

Koen

[1] https://github.com/openembedded/oe-core/commit/06b740d4ca077fb4c89ee6d1065fabb02da45ec6
[2] https://github.com/openembedded/oe-core/commit/830df6067c1ea4a5aab580b42ba7e1e84fe1bcbf
Paul Eggleton March 20, 2012, 1:14 p.m.
On Tuesday 20 March 2012 13:11:38 Koen Kooi wrote:
> I think I know why [2] doesn't work:
> 
> koen@dominion:/data/ssd/OE/buildhistory$ echo b c a | sort
> b c a
> koen@dominion:/data/ssd/OE/buildhistory$ echo -e b\\nc\\na | sort
> a
> b
> c

Ah, yes, that would explain it. I've primarily been testing with rpm for which 
list_package_depends produces the output on separate lines, whereas the ipk 
version is all on one line; it only matters when doing the sort. Oops!

> Let me know how you want me to respin this series, each commit is an RFC on
> its own.

If you wouldn't mind I think the easiest thing is to drop the sort/uniq on the 
depends/recommends and just rely on the one at the end that you're adding. It 
does mean we'll have a one-time resort of the recommends and depends together 
but at least it will work properly for all package backends.

Thanks,
Paul
Koen Kooi March 20, 2012, 1:40 p.m.
Op 20 mrt. 2012, om 14:14 heeft Paul Eggleton het volgende geschreven:

> On Tuesday 20 March 2012 13:11:38 Koen Kooi wrote:
>> I think I know why [2] doesn't work:
>> 
>> koen@dominion:/data/ssd/OE/buildhistory$ echo b c a | sort
>> b c a
>> koen@dominion:/data/ssd/OE/buildhistory$ echo -e b\\nc\\na | sort
>> a
>> b
>> c
> 
> Ah, yes, that would explain it. I've primarily been testing with rpm for which 
> list_package_depends produces the output on separate lines, whereas the ipk 
> version is all on one line; it only matters when doing the sort. Oops!
> 
>> Let me know how you want me to respin this series, each commit is an RFC on
>> its own.
> 
> If you wouldn't mind I think the easiest thing is to drop the sort/uniq on the 
> depends/recommends and just rely on the one at the end that you're adding

V2 sent. I'm not really happy with the interaction between patch 1 and 2, so I'd appreciate some feedback on it.

regards,

Koen