Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve dtdiff by avoiding or nulling the phandles #145

Open
jcormier opened this issue Aug 6, 2024 · 30 comments
Open

Improve dtdiff by avoiding or nulling the phandles #145

jcormier opened this issue Aug 6, 2024 · 30 comments

Comments

@jcormier
Copy link

jcormier commented Aug 6, 2024

The dtdiff tool is super useful when trying to compare two different device trees. However, it generally generates a ton of noise flagging all the phandles as changed between the two devices. This makes it hard to see the real changes.

How hard would it be to add a flag to the dtc compiler, maybe a phandle format that reports all phandles as 0x0?

@dragonsploder
Copy link

I agree, this would be very useful to have.

@dragonsploder
Copy link

Here is a temporary hack to print NULL for all phandles:

diff --git a/treesource.c b/treesource.c
index ae15839..8fb3bf4 100644
--- a/treesource.c
+++ b/treesource.c
@@ -204,6 +204,11 @@ static void write_propval(FILE *f, struct property *prop)
        enum markertype emit_type = TYPE_NONE;
        char *srcstr;
 
+       if (strcmp(prop->name, "phandle") == 0) {
+               fprintf(f, " = NULL;\n");
+               return;
+       }
+
        if (len == 0) {
                fprintf(f, ";");
                if (annotate) {

@dgibson
Copy link
Owner

dgibson commented Aug 7, 2024

If you just want to skip the actual phandle (and linux,phandle) properties, we could add an option for that and use it in dtdiff - we already have the sort (-s) option largely for the benefit of dtdiff. I won't have capacity to do this myself, but I'd by happy enough to review and merge patches for it.

If you wanted to skip phandles included in other properties, that would be much, much harder.

@jcormier
Copy link
Author

jcormier commented Aug 7, 2024

If you just want to skip the actual phandle (and linux,phandle) properties

Would you recommend handling that in the input side of things or the output? The above hack is doing it on the output. Just looking for a starting place to look.

If you wanted to skip phandles included in other properties, that would be much, much harder.

Yeah I mean that would be good as well but agreed that it would be more involved. Is it reasonable to try and handle both scenarios by replacing the handle hex value with the node name its pointing too?

Also, do you have any suggestions for what you would like the new argument to be? I mentioned the phandle format option but looking at the code I'm unclear what that's doing, or if its even relevant. Would you prefer to add a generic --diff-format option that could be improved/expanded later. Or a more specific option (--skip-phandles) that describes what this change is doing.

@dgibson
Copy link
Owner

dgibson commented Aug 8, 2024

If you just want to skip the actual phandle (and linux,phandle) properties

Would you recommend handling that in the input side of things or the output? The above hack is doing it on the output. Just looking for a starting place to look.

Output definitely. That way all the internal processing of phandles remains unchanged.

If you wanted to skip phandles included in other properties, that would be much, much harder.

Yeah I mean that would be good as well but agreed that it would be more involved. Is it reasonable to try and handle both scenarios by replacing the handle hex value with the node name its pointing too?

Actually, that's a good point. If you're coming from dts input, the type markers we already have mean that references to phandles should already be shown as &label or whatever - at least if that's how they were given in the input. If you have dtb input, I don't think it's feasible to try to handle this case: you don't know where the phandle references are without detailed knowledge of the bindings which is out of scope for dtc.

Also, do you have any suggestions for what you would like the new argument to be? I mentioned the phandle format option but looking at the code I'm unclear what that's doing, or if its even relevant. Would you prefer to add a generic --diff-format option that could be improved/expanded later. Or a more specific option (--skip-phandles) that describes what this change is doing.

I'd suggest something specific, much like --sort. --omit-phandles would be the obvious choice to me.

@jcormier
Copy link
Author

jcormier commented Aug 8, 2024

If you have dtb input, I don't think it's feasible to try to handle this case: you don't know where the phandle references are without detailed knowledge of the bindings which is out of scope for dtc.

Ahh I didn't know about that. I just read through https://elinux.org/Device_Tree_Mysteries#Phandle which explains it quite well.

Thus the value of the property "interrupt-parent" in node uart@200 is merely a u32 value. It is not a pointer to the location of node /soc/pic@100 in the device tree blob. When loaded by the kernel, it is also not a pointer in the expanded device tree. It is merely a value in a property. The only way that kernel code knows that the value in the "interrupt-parent" property is a phandle is because the binding documentation has defined that the value is a phandle. Thus the code that reads the "interrupt-parent" property has to be written to treat that value as a phandle. When the code needs to find what node the phandle points to, it calls of_find_node_by_phandle(), which searches for a node containing a "phandle" property containing the specified value.

@jcormier
Copy link
Author

jcormier commented Aug 8, 2024

A simpler change would be something like the following, though it will add a dependency on sed or grep.

It brings up the question of do you want dtdiff to always drop phandles or is it okay to add argument handling to this script?

--- a/dtdiff
+++ b/dtdiff
@@ -28,7 +28,7 @@ source_and_sort () {
        exit 2
     fi
 
-    $DTC -I $IFORMAT -O dts -qq -f -s -o - "$DT"
+    $DTC -I $IFORMAT -O dts -qq -f -s -o - "$DT" | grep -v 'phandle ='
 }
 
 if [ $# != 2 ]; then

@dgibson
Copy link
Owner

dgibson commented Aug 9, 2024

A simpler change would be something like the following, though it will add a dependency on sed or grep.

Simpler, but not necessarily reliable. It will incorrectly omit lines that happen to have "phandle =" in them, even if it's a string in the property value.

@jcormier
Copy link
Author

jcormier commented Aug 9, 2024

Sure, I don't think "phandle =" is likely to show up in a property value but if you'd prefer the more thorough implementation, I'll look into it.

@dgibson
Copy link
Owner

dgibson commented Aug 11, 2024

Sure, I don't think "phandle =" is likely to show up in a property value but if you'd prefer the more thorough implementation, I'll look into it.

It's certainly not likely, but I'd rather not a hard to debug edge case as a landmine for the future.

@jcormier
Copy link
Author

Makes sense.

Separately I was thinking about trying to track phandles and thought up a possible method.

  • After decompile, go through each file output and create a phandle -> node path map.
  • Compare diff of the two files and only replace the value if the old and new value point to the same node path
  • Run unified diff on filtered files

I will write something up in python to give this a try.

@jcormier
Copy link
Author

Here is a first pass at implementing a dtdiff in python

https://gist.github.com/jcormier/4af3e1a33cfc4d20690350cfa9e134a1

@dgibson
Copy link
Owner

dgibson commented Sep 9, 2024

Here is a first pass at implementing a dtdiff in python

https://gist.github.com/jcormier/4af3e1a33cfc4d20690350cfa9e134a1

I don't think this is a wise approach. The basic idea of attempting to recognize phandle changes by the values is flawed, I think. It'll get it right most of the time, but it will be very confusing when it interprets some unrelated change as a phandle update.

More superficially this is using regexes to re-parse parts of the tree from source, which seems a very roundabout way of doing things when dtc can already parse and knows some of the mappings you're trying to determine internally. I'm not very convinced about the robustness of a bunch of those regexes either (phandles need not have exactly two hex digits, for one example).

In short, this seems like a lot of trouble and complexity to go to, to get a result that won't even be reliable.

@a3f
Copy link
Contributor

a3f commented Sep 9, 2024

FYI, #100 by @ukleinek was meant to be a first step towards improving dtdiff. By keeping around the information about labels, it's possible to decompile them and have dtdiff eventually be aware of them.

This doesn't help you however if the only thing you have is the compiled device trees that lack the fixup information.

@jcormier
Copy link
Author

FYI, #100 by @ukleinek was meant to be a first step towards improving dtdiff. By keeping around the information about labels, it's possible to decompile them and have dtdiff eventually be aware of them.

Are the fixups the same thing that gets enabled when you want overlays to work?

@jcormier
Copy link
Author

I don't think this is a wise approach. The basic idea of attempting to recognize phandle changes by the values is flawed, I think. It'll get it right most of the time, but it will be very confusing when it interprets some unrelated change as a phandle update.

First I don't think the tool I'm looking for needs to be perfect. I just want to find likely changes between two device trees for when things stop working between one version and another. And the old dtdiff easily prints out several thousand lines of diff. And even this quick hack, drops that down to a hundred.

There is a pretty low risk that a phandle changes from 0xa1 to 0xa5, and a completely unrelated value also changes from 0xa1 to 0xa5.

More superficially this is using regexes to re-parse parts of the tree from source, which seems a very roundabout way of doing things when dtc can already parse and knows some of the mappings you're trying to determine internally.

This is fair, however I'm not familiar with this code base or parsing code in general. My first pass through the code told me it would take me too long to figure out how to implement a phandle map, when this was just going to be a proof of concept. And making one in python was simple enough, I was going to do it using "awk" but that would have been even less readable.

Note that since dtc itself cannot determine what is a phandle or not since it is only ever fed one file at a time. This method requires looking at both dtb files together, to get the context of what is or isn't a phandle value. I'm pretty sure that's not a feature that would make sense to bake into dtc. But dtc could easily create some kind of phandle map file...

I'm not very convinced about the robustness of a bunch of those regexes either (phandles need not have exactly two hex digits, for one example).

Good catch. Hexvalues are always going to end with a space or non-alphanumeric character so no need to limit them to 1-byte

@dgibson
Copy link
Owner

dgibson commented Sep 17, 2024

FYI, #100 by @ukleinek was meant to be a first step towards improving dtdiff. By keeping around the information about labels, it's possible to decompile them and have dtdiff eventually be aware of them.

Are the fixups the same thing that gets enabled when you want overlays to work?

Basically, yes.

I don't think this is a wise approach. The basic idea of attempting to recognize phandle changes by the values is flawed, I think. It'll get it right most of the time, but it will be very confusing when it interprets some unrelated change as a phandle update.

First I don't think the tool I'm looking for needs to be perfect. I just want to find likely changes between two device trees for when things stop working between one version and another. And the old dtdiff easily prints out several thousand lines of diff. And even this quick hack, drops that down to a hundred.

Ok. For your own tool do whatever works for you, obviously. But I'm saying that approach would not be suitable for an "official" tool packaged as part of the dtc/libfdt tree.

There is a pretty low risk that a phandle changes from 0xa1 to 0xa5, and a completely unrelated value also changes from 0xa1 to 0xa5.

Pretty low, yes, but not so low I'm comfortable with it: I've seen a whole heap of unlikely coincidences like this cause very cryptic problems in my time. Also consider that while an unrelated change from 0xa1 to 0xa5 is unlikely, both a phandle and other change from 0x1 to 0x2 is rather more likely. Indeed phandle changes are probably not random in practice: a very likely change is that most of the phandles are shifted by one because of the insertion of an extra node. That means any unrelated change by 1 within the whole range of phandles will get a false positive.

More superficially this is using regexes to re-parse parts of the tree from source, which seems a very roundabout way of doing things when dtc can already parse and knows some of the mappings you're trying to determine internally.

This is fair, however I'm not familiar with this code base or parsing code in general. My first pass through the code told me it would take me too long to figure out how to implement a phandle map, when this was just going to be a proof of concept. And making one in python was simple enough, I was going to do it using "awk" but that would have been even less readable.

Right. As I think @a3f is obliquely suggesting you could use the -L option to force the generation of __local_fixups__ information which you could then parse more simply to tell you where the phandles are. This won't help you if your initial input is dtb (without fixups) rather than dts of course. Once it's a dtb, you don't know where the phandles are except by complicated understanding of all the bindings.

Note that since dtc itself cannot determine what is a phandle or not since it is only ever fed one file at a time. This method requires looking at both dtb files together, to get the context of what is or isn't a phandle value. I'm pretty sure that's not a feature that would make sense to bake into dtc. But dtc could easily create some kind of phandle map file...

That's kind of what __local_fixups__ and __fixups__ are.

I'm not very convinced about the robustness of a bunch of those regexes either (phandles need not have exactly two hex digits, for one example).

Good catch. Hexvalues are always going to end with a space or non-alphanumeric character so no need to limit them to 1-byte

@jcormier
Copy link
Author

Basically, yes.

I should have checked, --symbols is enabled to support overlays at least in the 6.1 kernel. So my device trees don't have fixups built in.

-@, --symbols              
        Enable generation of symbols

Ok. For your own tool do whatever works for you, obviously. But I'm saying that approach would not be suitable for an "official" tool packaged as part of the dtc/libfdt tree.

Fair enough. I still suspect it would be a useful tool for more than just me. Maybe this issue will serve as a starting place for someone else looking for this feature.

I suppose if the kernel outputted intermediate dts files with all the #includes handled, I wouldn't need to compare dtb files. Then all of this wouldn't be needed in most cases.

@dgibson
Copy link
Owner

dgibson commented Sep 20, 2024

Basically, yes.

I should have checked, --symbols is enabled to support overlays at least in the 6.1 kernel. So my device trees don't have fixups built in.

I don't think the second sentence follows from the first. But I guess your point is that the kernel uses -@ but not -L which would generate the fixups as well

-@, --symbols              
        Enable generation of symbols

Ok. For your own tool do whatever works for you, obviously. But I'm saying that approach would not be suitable for an "official" tool packaged as part of the dtc/libfdt tree.

Fair enough. I still suspect it would be a useful tool for more than just me. Maybe this issue will serve as a starting place for someone else looking for this feature.

I suppose if the kernel outputted intermediate dts files with all the #includes handled, I wouldn't need to compare dtb files. Then all of this wouldn't be needed in most cases.

It should be pretty easy to make it do that. Just find the rules in the makefile that invoke dtc to build the dtbs, and change it from -O dtb to -O dts.

@jcormier
Copy link
Author

I don't think the second sentence follows from the first. But I guess your point is that the kernel uses -@ but not -L which would generate the fixups as well

Hmm, I see a symbols struct in my dtb but I don't see any fixup related entries. So I assumed they weren't being added. In fact it doesn't look like the -L option was included in the kernel until 6.11

It should be pretty easy to make it do that. Just find the rules in the makefile that invoke dtc to build the dtbs, and change it from -O dtb to -O dts.

Okay. Good idea. I however cannot find a instance of '-O dtb' in the kernel source tree, that isn't linked to mips or powerpc. So it must be using some variables or similar to obscure it...

I did however find something surprising and potentially useful. A kernel supplied dtb/s diff tool

./scripts/dtc/dtx_diff -h

Usage:

   dtx_diff DTx
        decompile DTx

   dtx_diff DTx_1 DTx_2
        diff DTx_1 and DTx_2


      --annotate    synonym for -T
      --color       synonym for -c (requires diff with --color support)
       -c           enable colored output
       -f           print full dts in diff (--unified=99999)
       -h           synonym for --help
       -help        synonym for --help
      --help        print this message and exit
       -s SRCTREE   linux kernel source tree is at path SRCTREE
                        (default is current directory)
       -S           linux kernel source tree is at root of current git repo
       -T           annotate output .dts with input source file and line
                        (-T -T for more details)
       -u           unsorted, do not sort DTx


Each DTx is processed by the dtc compiler to produce a sorted dts source
file.  If DTx is a dts source file then it is pre-processed in the same
manner as done for the compile of the dts source file in the Linux kernel
build system ('#include' and '/include/' directives are processed).

This does handle diffing dts files with #includes. Unfortunately, it doesn't do anything to handle the phandle changes so it's just as painful to dig through the diff. How feasible would it be to have an option which allows -O dts, to leave the phandle labels in place? So syscon = <0x35>; would be syscon = <&wkup_conf>; instead.

        a53_opp_table: opp-table {
                compatible = "operating-points-v2-ti-cpu";
                opp-shared;
-               phandle = <0x49>;
-               syscon = <0x35>;
+               phandle = <0x52>;
+               syscon = <0x3d>;

@dgibson
Copy link
Owner

dgibson commented Sep 21, 2024 via email

@jcormier
Copy link
Author

You mean it was included in the options given to dtc after that? Or that at that point the embedded dtc version had support for the flag?

The embedded dtc got support for the flag. I didn't see any commits actually using it.

If coming from dts, not at all tricky - we already do this.

Yeah, this is the scenario I was thinking about. I understand the limitations with the dtb's, at least the ones discussed in this issue.

This does handle diffing dts files with #includes. Unfortunately, it doesn't do anything to handle the phandle changes so it's just as painful to dig through the diff. How feasible would it be to have an option which allows -O dts, to leave the phandle labels in place? So syscon = <0x35>; would be syscon = <&wkup_conf>; instead.

When running the dtx_diff tool on two dts files, it still outputted the phandles instead of labels. Is it a simple matter to keep it from doing that?

@dgibson
Copy link
Owner

dgibson commented Sep 29, 2024

You mean it was included in the options given to dtc after that? Or that at that point the embedded dtc version had support for the flag?

The embedded dtc got support for the flag. I didn't see any commits actually using it.

Ok.

If coming from dts, not at all tricky - we already do this.

Yeah, this is the scenario I was thinking about. I understand the limitations with the dtb's, at least the ones discussed in this issue.

This does handle diffing dts files with #includes. Unfortunately, it doesn't do anything to handle the phandle changes so it's just as painful to dig through the diff. How feasible would it be to have an option which allows -O dts, to leave the phandle labels in place? So syscon = <0x35>; would be syscon = <&wkup_conf>; instead.

When running the dtx_diff tool on two dts files, it still outputted the phandles instead of labels. Is it a simple matter to keep it from doing that?

I don't know, sorry. I had a brief look at dtx_diff and found it surprisingly complicated and hard to follow. It's also possible this is a question of the embedded dtc version again.

@jcormier
Copy link
Author

jcormier commented Oct 2, 2024

I don't know, sorry. I had a brief look at dtx_diff and found it surprisingly complicated and hard to follow. It's also possible this is a question of the embedded dtc version again.

Yeah no kidding.

How feasible would it be to have an option which allows -O dts, to leave the phandle labels in place?

Is this something the normal dtc binary does by default when you dtc -I dts -O dts?

@dgibson
Copy link
Owner

dgibson commented Oct 30, 2024

@ukleinek 's pull request #151 is likely to improve this, but there are a number of details to sort out before it's ready to merge.

@ukleinek
Copy link
Contributor

@jcormier If you have the time and motivation, please test with my PR #151 that I consider ready now (but please follow the feedback I get from @dgibson there for sure). I think it should improve your situation quite a bit.

@jcormier
Copy link
Author

@jcormier If you have the time and motivation, please test with my PR #151 that I consider ready now (but please follow the feedback I get from @dgibson there for sure). I think it should improve your situation quite a bit.

Testing against two dtbs produced by the kernel I still see lots of phandle based diffs. I do see labels for each node being recovered which is nice but I don't see references to those labels getting fixed. Does the dtb need to be built with the updated dtc? If so that probably won't help things until far in the future when the kernel adopts the changes.

Example:
Original dts

	tlv320aic26: audio-codec@1 {
		status = "okay";
		#sound-dai-cells = <0>;
		compatible = "ti,tlv320aic26";
		reg = <0x1>;

		/* Regulators */
		AVDD-supply = <&vcc_3v3_vio>;
		IOVDD-supply = <&vcc_3v3_vio>;
		DRVDD-supply = <&vcc_3v3_vio>;
		DVDD-supply = <&som_vdd_1v8>;

		reset-gpios = <&exp1 7 GPIO_ACTIVE_LOW>;
		spi-max-frequency = <1000000>;
		spi-cpha;
	};

Decompiled with changes from phandles branch. None of the supplies or the reset gpio got restored. Is that expected?

jcormier@jcormier-MS-7A93:~/build/dtc$ ./dtc -I dtb -O dts -qq -f -s -o - k3-am62x-mitysom-devkit.dtb | grep tlv320 -A8
                        tlv320aic26: audio-codec@1 {
                                #sound-dai-cells = <0x00>;
                                AVDD-supply = <0x2c>;
                                DRVDD-supply = <0x2c>;
                                DVDD-supply = <0x2d>;
                                IOVDD-supply = <0x2c>;
                                compatible = "ti,tlv320aic26";
                                phandle = <0x5f>;
                                reg = <0x01>;
                                reset-gpios = <0x28 0x07 0x01>;
                                spi-cpha;
                                spi-max-frequency = <0xf4240>;
                                status = "okay";
                        };

@ukleinek
Copy link
Contributor

Compile with -@ -L, otherwise the metadata isn't available to know which data chunks are phandles.

@jcormier
Copy link
Author

Compile with -@ -L, otherwise the metadata isn't available to know which data chunks are phandles.

Gotcha, at least the version of the kernel 6.6 I'm using only builds with -@ for overlay support. It doesn't build with -L so that's unfortunate.

@ukleinek
Copy link
Contributor

Also note that the kernel doesn't use the host dtc but ships its own copy. And you can inject additional parameters with the DTC_FLAGS env var.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants