Propose of new structure and associated scripts #1
base: master
Hey Jirka, thank you for your proposed change!
Global comments:
- Dependency tracking is unfortunately broken after this patch, so even if a test or its config has not changed, it will still re-run tests.
- I think it is better to have an "unversioned" test too (to track the development branch) and only add an override for specific versions if these turn out to be incompatible. You will probably need at least a new PDML file; if a new XSL file is needed, we can revisit that later. (The original version the test was tried with can be added in a commit message if necessary.)
- Original Makefile was minimalistic and intentionally lacked "features" that are not used. What about removing the "no_pdml" and "no_text" options? Let this be implicit from missing PDML/text files?
Do we really need to test the text output if we already have PDML output? Granted, the Protocol and Info column are not tested, but is that something that is likely to cause issues that are not detected from the PDML output?
Perhaps `tshark` also needs to be run with `HOME=$(mktemp -d)` or something, it seems that your preferences contain additional columns that are not present in the default config.
Tentative verdict: it looks more complex now, I would prefer something more readable than Bash (Python?) if more logic is needed, but acknowledge that the "do-ers" will move this project forward :-)
scripts/sample_make_output.sh
Outdated
TYPE="$3"
REQ_VERSION="$4"

${TSHARK_EXECUTABLE} --version > /dev/null 2> /dev/null
Shouldn't this be quoted like `"${TSHARK_EXECUTABLE}"` (same below)?
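A minimal sketch of why the quotes matter, assuming a hypothetical install path that contains a space. Nothing is executed here; the helper `count_words` (not part of the PR) only reports how many words the shell passes on:

```shell
# Hypothetical path with a space; not taken from the PR.
TSHARK_EXECUTABLE="/opt/my tools/tshark"

# Helper for illustration only: prints the number of arguments it received.
count_words() { echo "$#"; }

# Unquoted: the shell splits the path at the space, so the command name
# would become "/opt/my" and the rest would be passed as extra arguments.
unquoted=$(count_words ${TSHARK_EXECUTABLE} --version)

# Quoted: the whole path stays a single word.
quoted=$(count_words "${TSHARK_EXECUTABLE}" --version)

echo "unquoted: ${unquoted} words, quoted: ${quoted} words"
```

With the default `IFS`, the unquoted form yields three words and the quoted form two, which is why the unquoted call would fail for such a path.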
scripts/sample_make_output.sh
Outdated
fi

if [ ! -f "${OUTPUT_FILE}" -o ${FILE_PCAP} -nt ${OUTPUT_FILE} ]; then
"${TSHARK_EXECUTABLE}" $TSHARK_ARGS -T ${XTYPE} ${XARGS} -r "${FILE_PCAP}" > "${OUTPUT_FILE}".tmp
Missed opportunity for parallelism here I think, now single-pass cannot run together with -2
Makefile
Outdated
%.text.current:
@./scripts/sample_test.sh "$(TSHARK_EXECUTABLE)" "$(basename $(basename $@))" text $(SELECTED_VERSIONS)

all:
I think that the default target should still be "test" and that this information belongs in the README (or `make help` if you still want a summary).
--------------------------------------

SUPPORTED_VERSIONS - list of versions checked during make or make outputs, when not specified, default in Makefile is used
VERSION - version used for make or make outputs, when not specified, tshark version is used
maybe WS_VERSION (to make clearer that this is not some other version)?
if [ -f "${FILE}.pcap.gz" ]; then
FILE_PCAP="${FILE}.pcap.gz"
elif [ -f "${FILE}.pcapng.gz" ]; then
FILE_PCAP="${FILE}.pcapng.gz"
There are capture formats other than (compressed) pcap like android logcat, etc. What do you think about using a single extension (like FOO.pcap or FOO.cap) even if it is compressed? Otherwise we might have a lot of files here.
Alternatively, we can stick to the original convention of looking for FOO given FOO.pdml (e.g. dns.pcapng.pdml)
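The "FOO given FOO.pdml" convention could be sketched as follows. The function name `capture_for` and the second file name are made up for illustration; the idea is only that the capture name is the expected-output name minus its final `.pdml` suffix, whatever the capture format is:

```shell
# Sketch: derive the capture file name from the expected-output file name
# by stripping the trailing ".pdml" (dns.pcapng.pdml -> dns.pcapng).
capture_for() {
    printf '%s\n' "${1%.pdml}"
}

capture_for dns.pcapng.pdml
capture_for boot.logcat.pdml   # hypothetical non-pcap sample
```

This way the scripts would not need to know any list of capture extensions.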
Dear Peter,
Global comments:
* Dependency tracking is unfortunately broken after this patch, so
even if a test or its config has not changed, it will still re-run
tests.
Which target do you run?
I tested it with 'make test' and it runs the tests only once, with one
exception: when the test set is not complete. E.g. when the .pdml or .text
files are missing, the test is run on every invocation.
* I think it is better to have an "unversioned" test too (to track the
development branch) and only add an override for specific versions
if these turn out to be incompatible. You will probably need at
least a new PDML file, if a new XSL file is needed, we can revisit
that later. (The original version where the test was tried with can
be added in a commit message if necessary.)
That's another point of view - my idea was to be sure that we know
which version every output file came from.
Let's imagine that an unversioned file has been there for years - we have no
evidence of which version generated it. As a consequence, you can't
override it or add a version to it, because you don't know the version.
You might say it is not important whether the version was 2.0 or 2.2 when
the output is the same, which is true in general.
From another point of view - when you want to write automated
tests/regression tests, your code must check whether there is a
version-specific file, an older file, or a generic file. That looks
complicated to me.
But we can discuss it later.
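To give a feel for how much lookup logic this would need, here is a rough two-tier sketch. The versioned name follows the `${FILE}_${TSHARK_VERSION}.${TYPE}` pattern from the patch; the unversioned `${FILE}.${TYPE}` fallback and the function name `find_reference` are assumptions, and the "older version" tier is deliberately left out:

```shell
# Sketch: pick the reference output for a sample.
# Prefers the version-specific file, falls back to an unversioned one.
find_reference() {
    ref_file=$1 ref_version=$2 ref_type=$3
    if [ -f "${ref_file}_${ref_version}.${ref_type}" ]; then
        printf '%s\n' "${ref_file}_${ref_version}.${ref_type}"
    elif [ -f "${ref_file}.${ref_type}" ]; then
        printf '%s\n' "${ref_file}.${ref_type}"
    else
        return 1   # no reference output at all
    fi
}

# Demo in a scratch directory with made-up sample files.
demo_dir=$(mktemp -d)
touch "${demo_dir}/dns.pdml" "${demo_dir}/dns_2.2.pdml"
find_reference "${demo_dir}/dns" 2.2 pdml   # version-specific file wins
find_reference "${demo_dir}/dns" 2.4 pdml   # falls back to the unversioned file
rm -rf "${demo_dir}"
```

Adding a third tier for "nearest older version" would require comparing version numbers, which is where the complexity starts to grow.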
* Original Makefile was minimalistic and intentionally lacked
"features" that are not used. What about removing the "no_pdml" and
"no_text" options? Let this be implicit from missing PDML/text files?
My idea is to be sure that every sample is complete before putting it into
the repository (verify_repository target). I think that a missing .pdml or
.text file is exceptional - it will probably be used for samples intended for
GUI testing by a human. Therefore, when there is no pdml or text file, it
is an error.
BTW I have a set of GUI-related files ready.
From another point of view - we can require the pdml and text files every
time, even if the sample is intended for another purpose.
Do we really need to test the text output if we already have PDML
output? Granted, the Protocol and Info column are not tested, but is
that something that is likely to cause issues that are not detected from
the PDML output?
I added it because of my experience that the PDML output contained
something different from the TEXT file. It was caused by an error in the code.
I know it is not possible to deduce what the TEXT file should look like
based on the PDML file. But I found the error when I compared the PDML and TEXT
files for different versions - PDML changed because of new features, but
TEXT didn't, even though it should have.
I think that the ability to make such comparisons is important for developers.
Perhaps |tshark| also needs to be run with |HOME=$(mktemp -d)| or
something, it seems that your preferences contains additional columns
that are not present in the default config.
Interesting, I was not aware of it. But you are right, some kind of
Wireshark configuration should be part of the sample.
Do you have any advice/ideas about what should be stored?
Tentative verdict: it looks more complex now, I would prefer something
more readable than Bash (Python?) if more logic is needed, but
acknowledge that the "do-ers" will move this project forward :-)
I think so. A Makefile (the Makefile language) is not enough to write
scripts, therefore I selected bash. But if Python is better, let's
change it.
It would be a good idea to use one of the languages required to
compile Wireshark - I expect that whoever runs the happy-shark
scripts is a developer of Wireshark. Others will just use the .pcaps from
the repository for other purposes.
I'm working on Linux, but I don't know which tools must be part of the
environment on Windows. Do you know? Is Python on this list? If so,
which version?
------------------------------------------------------------------------
In scripts/sample_make_output.sh
<#1 (comment)>:
> @@ -0,0 +1,91 @@
+#!/bin/bash
+
+TSHARK_EXECUTABLE="$1"
+SAMPLE_DIR="$2"
+TYPE="$3"
+REQ_VERSION="$4"
+
+${TSHARK_EXECUTABLE} --version > /dev/null 2> /dev/null
Shouldn't this be quoted like |"${TSHARK_EXECUTABLE}"| (same below)?
Sure, I will fix it if we continue with bash.
------------------------------------------------------------------------
In scripts/sample_make_output.sh
<#1 (comment)>:
> +fi
+
+OUTPUT_FILE="${FILE}_${TSHARK_VERSION}.${TYPE}"
+
+XTYPE=${TYPE}
+XARGS=
+if [ "${TYPE}" == "pdml1" ]; then
+ XTYPE=pdml
+ XARGS=
+elif [ "${TYPE}" == "pdml2" ]; then
+ XTYPE=pdml
+ XARGS=-2
+fi
+
+if [ ! -f "${OUTPUT_FILE}" -o ${FILE_PCAP} -nt ${OUTPUT_FILE} ]; then
+ "${TSHARK_EXECUTABLE}" $TSHARK_ARGS -T ${XTYPE} ${XARGS} -r "${FILE_PCAP}" > "${OUTPUT_FILE}".tmp
Missed opportunity for parallelism here I think, now single-pass cannot
run together with -2
You mean generating .pdml1 and .pdml2 in parallel?
I was working on a parallel variant, but I found that the error
messages, if any, are hard to read. But it can be changed...
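One possible way to keep error messages readable while running both passes in parallel is to give each job its own log files. The sketch below is only an illustration under that assumption; `run_pair` is a hypothetical helper, and the demo uses placeholder commands instead of tshark:

```shell
# Sketch: run two shell command strings in parallel, one stdout/stderr
# log per job, and report which job failed.
# Usage: run_pair LOG_PREFIX 'command one' 'command two'
run_pair() {
    prefix=$1
    sh -c "$2" > "${prefix}.1.out" 2> "${prefix}.1.err" &
    pid1=$!
    sh -c "$3" > "${prefix}.2.out" 2> "${prefix}.2.err" &
    pid2=$!

    pair_status=0
    wait "${pid1}" || { echo "job 1 failed, see ${prefix}.1.err" >&2; pair_status=1; }
    wait "${pid2}" || { echo "job 2 failed, see ${prefix}.2.err" >&2; pair_status=1; }
    return "${pair_status}"
}

# Demo with placeholder commands; the second one fails on purpose.
demo_dir=$(mktemp -d)
run_pair "${demo_dir}/demo" 'echo single-pass' 'echo oops >&2; exit 3' \
    || echo "at least one job failed"
cat "${demo_dir}/demo.2.err"
rm -rf "${demo_dir}"
```

In the real script the two command strings would be the single-pass and `-2` tshark invocations. Passing commands as strings is fragile for paths with spaces, so this is a starting point rather than a drop-in replacement.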
------------------------------------------------------------------------
In Makefile
<#1 (comment)>:
>
TSHARK_EXECUTABLE?=tshark
+TSHARK_VERSION=$(shell $(TSHARK_EXECUTABLE) --version | head -1 | cut -d' ' -f 3 | cut -d'.' -f1,2)
+
+%.pdml1.current:
+ @./scripts/sample_test.sh "$(TSHARK_EXECUTABLE)" "$(basename $(basename $@))" pdml1 $(SELECTED_VERSIONS)
+
+%.pdml2.current:
+ @./scripts/sample_test.sh "$(TSHARK_EXECUTABLE)" "$(basename $(basename $@))" pdml2 $(SELECTED_VERSIONS)
+
+%.text.current:
+ @./scripts/sample_test.sh "$(TSHARK_EXECUTABLE)" "$(basename $(basename $@))" text $(SELECTED_VERSIONS)
+
+all:
I think that the default target should still be "test" and that this
information belongs in the README (or |make help| if you still want a
summary).
OK.
------------------------------------------------------------------------
In README.md
<#1 (comment)>:
> When proposing a new test, please include the source of the packet capture file
in the commit message. The source could be a link to https://bugs.wireshark.org/
or https://wiki.wireshark.org/SampleCaptures for example. Try to keep capture
files small and specific to a small number of protocols.
+Options and variables to run framework
+--------------------------------------
+
+SUPPORTED_VERSIONS - list of versions checked during make or make outputs, when not specified, default in Makefile is used
+VERSION - version used for make or make outputs, when not specified, tshark version is used
maybe WS_VERSION (to make clearer that this is not some other version)?
Yes, good point.
------------------------------------------------------------------------
In scripts/sample_make_output.sh
<#1 (comment)>:
> +FILE=`basename "${SAMPLE_DIR}"`
+
+TSHARK_VERSION=`${TSHARK_EXECUTABLE} --version | head -1 | cut -d' ' -f 3 | cut -d'.' -f1,2`
+if [ -n "${REQ_VERSION}" ]; then
+ if [ "${REQ_VERSION}" != "${TSHARK_VERSION}" ]; then
+ echo " FAILED, required tshark version do not match running version"
+ exit 1
+ fi
+fi
+
+cd "${SAMPLE_DIR}"
+
+if [ -f "${FILE}.pcap.gz" ]; then
+ FILE_PCAP="${FILE}.pcap.gz"
+elif [ -f "${FILE}.pcapng.gz" ]; then
+ FILE_PCAP="${FILE}.pcapng.gz"
There are capture formats other than (compressed) pcap like android
logcat, etc. What do you think about using a single extension (like
FOO.pcap or FOO.cap) even if it is compressed? Otherwise we might have a
lot of files here.
Alternatively, we can stick to the original convention of looking for
FOO given FOO.pdml (e.g. dns.pcapng.pdml)
There are many points to view/discuss:
It is clear that there must be only one source file per directory. I
prefer to store files in compressed form - I expect we might store
longer samples in the future and it is better to see it clearly.
I don't think we must limit the format of a sample, but the script code must
be able to recognize it - therefore we must add a list of possible
extensions to the scripts and to the documentation.
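Such an extension list, together with the "only one source file per directory" rule, might look roughly like this. The list contents and the name `find_capture` are illustrative only; the patch itself currently checks just `.pcap.gz` and `.pcapng.gz`:

```shell
# Illustrative list of recognized capture extensions (not an agreed convention).
CAPTURE_EXTENSIONS="pcap pcap.gz pcapng pcapng.gz"

# Sketch: print the single capture file of a sample, or fail if there is
# none or more than one.
find_capture() {
    cap_dir=$1 cap_name=$2 cap_found="" cap_count=0
    for ext in ${CAPTURE_EXTENSIONS}; do
        if [ -f "${cap_dir}/${cap_name}.${ext}" ]; then
            cap_found="${cap_dir}/${cap_name}.${ext}"
            cap_count=$((cap_count + 1))
        fi
    done
    if [ "${cap_count}" -ne 1 ]; then
        echo "expected exactly one capture for ${cap_name}, found ${cap_count}" >&2
        return 1
    fi
    printf '%s\n' "${cap_found}"
}

# Demo in a scratch directory with a made-up sample.
demo_dir=$(mktemp -d)
touch "${demo_dir}/dns.pcapng.gz"
find_capture "${demo_dir}" dns
rm -rf "${demo_dir}"
```

Supporting a new format would then mean adding one entry to the list and to the documentation.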
On the other hand - to simplify things - can't everything be converted
to one format, e.g. pcapng?
I really don't know.
I don't think that using one extension for all formats is a good idea.
Wireshark might guess the format of a file from its content. But if you
would like to use a tool other than Wireshark to analyse the file, you
will be confused when a .pcap file has the .logcat format...
The Unix 'file' command might help you, but it is not easily available on Windows.
Are all formats supported by "a standard" build of Wireshark? It would
be better to use more common formats.
If we store a file in its original format, we allow developers to
test reading of that format - e.g. reading a logcat file produces different
output than reading a pcap.
But I'm not sure whether that is a point to care about.
…---------------
Thank you for the reply, I think we should agree on the language/tool for the
scripts first...
Sincerely yours,
Jirka Novak
Hello Peter,
Hey Jirka, thank you for your proposed change!
I didn't respond for a long time... Is happy-shark alive? Are there any
new activities regarding the wireshark library?
I found I would like to push more samples to the library, therefore I can
focus on your comments...
Best regards,
Jirka Novak
Hi Jirka,
Sorry for the long delay, there is indeed not much development here. Testing is still much needed, but the exact approach is still not set in stone. I guess that only by experimentation (like the original happy-shark), we can learn what works best. Feel free to create new samples as needed, and note that no activity from my side does not mean that I disapprove of your approach, but just lack of time to carefully look at it.
Edit: should you have any questions or comments, feel free to drop a comment though!
Hi Peter,
* Dependency tracking is unfortunately broken after this patch, so
even if a test or its config has not changed, it will still re-run
tests.
I fixed it. I learnt a lot about make :-)
* I think it is better to have an "unversioned" test too (to track the
development branch) and only add an override for specific versions
if these turn out to be incompatible. You will probably need at
least a new PDML file, if a new XSL file is needed, we can revisit
that later. (The original version where the test was tried with can
be added in a commit message if necessary.)
It is a good point about the unversioned test, but I don't think it is
useful. The question is how to use happy-shark? I think there are three ways:
1) repository of samples
2) repository for UI testers
3) regression testing
1) and 2) are not related to any automated tests. If you use happyshark
for 2) (as I do), you probably work this way:
a) make changes in wireshark code
b) checkout happyshark
c) run tests with new wireshark
d) store your output permanently
For a-c you don't need an explicit unversioned output - it is made by the test
procedure as a .current file and it is automatically compared to the output of
the latest release in happyshark.
For d) you need to make the output, but I believe it should carry the version
number - to know it in the future.
Another point of view:
If you are just "a developer" (I mean NOT a maintainer, just anyone who is
not responsible for the public release cycle), you will probably discard
your changes in happyshark once you commit your change, or you will keep
the changes in happyshark local.
If you are a maintainer, you will do regression testing, make new outputs
with the latest package and then commit them to happyshark for others.
Another question is whether a two-level release number is enough. My idea
was that outputs will be updated with each release in a branch. Therefore
<file>_2.4 will contain output from 2.4.0, later from 2.4.1, 2.4.2, ...
Therefore it will be possible to compare your change with the latest public
release of a branch.
On the other hand, we will lose changes/progress within a branch. We will just
see progress between branches.
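The two-level naming described above amounts to cutting the release number down to its first two fields, as the Makefile already does with `cut -d'.' -f1,2`. A small sketch, with `branch_of` as a made-up helper name:

```shell
# Sketch: reduce a full release number to the two-level branch number
# used in output file names such as <file>_2.4.
branch_of() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Every point release of a branch maps to the same output file.
for release in 2.4.0 2.4.1 2.4.2; do
    echo "${release} -> <file>_$(branch_of "${release}")"
done
```

All three releases map to `<file>_2.4`, which is exactly why progress within a branch is not visible with this scheme.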
* Original Makefile was minimalistic and intentionally lacked
"features" that are not used. What about removing the "no_pdml" and
"no_text" options? Let this be implicit from missing PDML/text files?
My opinion is that it is not a good approach. I imagine that some samples
would be focused on UI only. Therefore there is no need to store
outputs. If we make pdml/text files optional, the code can no longer
verify whether you are committing all required information with a new test
case.
You are right, it makes no sense to have no_pdml and no_text separate. I
think that no_output satisfies my use case. What do you think?
Do we really need to test the text output if we already have PDML
output? Granted, the Protocol and Info column are not tested, but is
that something that is likely to cause issues that are not detected from
the PDML output?
As I wrote before, I ran into this bug - the text info showed something different
from the pdml. I don't think that storing an additional test file costs so much...
Perhaps |tshark| also needs to be run with |HOME=$(mktemp -d)| or
something, it seems that your preferences contains additional columns
that are not present in the default config.
Thank you for finding it, you were right - I use custom columns and I
missed that it affects the output.
I use /tmp, as mktemp creates a new directory which I'm not able to remove
automatically in every case in the Makefile. Do you think it is OK?
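If the cleanup is the only obstacle, one option might be to do it in a small shell wrapper rather than in the Makefile. The sketch below assumes a hypothetical helper `with_clean_home`; it runs any command with a throwaway `HOME` and removes the directory afterwards, whether or not the command succeeded. Whether tshark also reads configuration from other locations is not covered here:

```shell
# Sketch: run a command with a fresh, empty HOME and always clean it up.
# Usage: with_clean_home COMMAND [ARGS...]
with_clean_home() {
    tmp_home=$(mktemp -d) || return 1
    # Subshell so the changed HOME never leaks into the calling script.
    ( HOME="${tmp_home}"; export HOME; "$@" )
    home_status=$?
    rm -rf "${tmp_home}"
    return "${home_status}"
}

# Demo with a placeholder command; the scripts would pass the tshark
# command line here instead.
with_clean_home sh -c 'echo "HOME is ${HOME}"'
```

The exit status of the wrapped command is preserved, so make still sees a failing test as a failure.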
Tentative verdict: it looks more complex now, I would prefer something
more readable than Bash (Python?) if more logic is needed, but
acknowledge that the "do-ers" will move this project forward :-)
There are things I'm not able to write in a Makefile (it might be
possible, but I don't know how). As the Makefile calls shell/bash,
I moved such steps to shell scripts so as not to add additional dependencies.
I can rewrite it in Python or Perl, but I'm not sure whether it makes sense.
I think it makes sense to follow the requirements of the build environment for
Wireshark. But I do not have such an environment, e.g. for Windows or
macOS. It would be welcome if someone else tested the scripts in such
environments.
Best regards,
Jirka Novak
The general idea of the proposed change is to add background information about the sample - the publisher should describe what the sample contains and why it is added to the repository (e.g. unusual packet flow or headers, ...). The description should also state how the sample is expected to be decoded - e.g. a new header should be decoded compared to the previous Wireshark release.
For regression tests, pdml and txt output is stored and it is known which release of Wireshark produced it (only the major.minor number is stored).
Scripts are now able to: