Knowledge sharing related to Apache Ant, mainly ZipFile class…

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Knowledge sharing related to Apache Ant, mainly ZipFile class…

Roman Horváth
I wish you a good day,

this mail is mainly related to the Apache Ant ZipFile class. This mail shares some knowledge and makes some small recommendations for the ZipFile class and one recommendation for the ZipOutputStream class. (I separated them by few —.)

———

The first sentence within the file comments sounds:

        Replacement for java.util.ZipFile.

But in fact, it should sound:

        Replacement for java.util.zip.ZipFile.

(Similarly, a few lines later.)

———

In the constructor ZipFile(final File f, final String encoding, final boolean useUnicodeExtraFields) a tiny performance improvement may be made. In the block finally, the lines:

                        closed = !success;
                        if (!success) {

may sound:

                        closed = !success;
                        if (closed) {

It takes the same result. (The second “not” operation is not needed.)

———

I noticed that the ZipFile class got a new method – getName(). This method does not have a description. It may be (e.g.) simply (I guessed the version – this was the version when I noticed it):

        /**
         * Gets the archive name.
         *
         * @return current archive name.
         * @since 1.10.7
         */

Yet the list item in the beginning: “There is no getName method,” is no longer true.

———

I suggest adding a new feature to this class – the overall archive comment. (Because the class ZipOutputStream has the method setComment.)

It would take those few actions:

1. Add new attribute (a String comment) between existing ones, e.g. between the String archiveName and the RandomAccessFile archive:

        /**
         * File name of actual source.
         */
        private final String archiveName;

        /**
         * File (the overall archive) comment.
         */
        private String comment = null;

        /**
         * The actual data source.
         */
        private final RandomAccessFile archive;

2. Add the method getComment, e.g. after the method getName:

        public String getName() {
                return archiveName;
        }

        /**
         * Returns the file comment, or null (if there is no file comment
         * or the file comment has been ignored intentionally).
         * @see ZipOutputStream#setComment(String)
         */
        public String getComment() {
                return comment;
        }

3. Add new flag EOCD_SIG, e.g. after the flag CFH_SIG:

        private static final long CFH_SIG =
                ZipLong.getValue(ZipOutputStream.CFH_SIG); // 0x02014B50L

        private static final long EOCD_SIG = 0x06054B50L;

4. Add the following code to the positionAtCentralDirectory method:

        private void positionAtCentralDirectory() throws IOException
        {
                positionAtEndOfCentralDirectoryRecord(); // Existing line


                // Read the file comment
                long archivePos = archive.getFilePointer();
                try
                {
                        if (archivePos + 22 <= archive.length())
                        {
                                // Read the comment length only:
                                archive.skipBytes(20);
                                archive.readFully(SHORT_BUF);
                                int commentLen = ZipShort.getValue(SHORT_BUF);

                                if (0 != commentLen)
                                {
                                        // Cropping:
                                        if (archive.getFilePointer() + commentLen >=
                                                archive.length()) commentLen = (int)(archive.length()
                                                        - archive.getFilePointer());

                                        // Reading and decoding the file comment:
                                        final byte[] comment = new byte[commentLen];
                                        archive.readFully(comment);
                                        this.comment = zipEncoding.decode(comment);
                                }
                        }
                }
                catch (Throwable t)
                {
                        // Swallow, if anything goes wrong, the comment just stays as is.
                }
                finally
                {
                        // Return to the last active position:
                        archive.seek(archivePos);
                }


                boolean found = false; // Existing line


It took me a few days. I took the inspiration from several public sources. The code should work properly, but I recommend taking several tests (using test zip files in different conditions, which I don’t have) to prove it flawless. (I did it several weeks ago, and I did some basic testing, but to be sure I recommend making more tests.)

———

In the class ZipOutputStream, in the method setLevel(int level); a minor refactoring can be made. On the last few lines of code:

                if (this.level == level) {
                        return;
                }
                hasCompressionLevelChanged = true;
                this.level = level;

the return statement will become pointless after rewriting the code in a more effective way (negating the condition):

                if (this.level != level) {
                        hasCompressionLevelChanged = true;
                        this.level = level;
                }

(According to https://stackoverflow.com/a/23354212/2036423 this shouldn’t be any problem.)

———

Best regards,
Roman Horváth


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Knowledge sharing related to Apache Ant, mainly ZipFile class…

Stefan Bodewig
Hello Roman

many thanks for your comments and your interest in Ant's zip support.

Ant's zip package has been the inspiration and nucleus for the zip
package in Apache Commons Compress and the code has become more evolved
and more versatile over there:

    https://commons.apache.org/proper/commons-compress/

Every now and then we've backported bugfixes from Commons Compress but
not additional features. The goal has been to keep the zip package
inside of Ant focussed on what is actually used by Ant's tasks. That the
classes go beyond that scope is mostly due to backwards
compatibility. Therefore I wouldn't want to add comment support to
ZipFile in Ant (as nobody would use it) but could see it being added in
Commons Compress.

In order to make backporting easier we try to not change the code
structure here without changing it on Commons Compress as well - so that
applying patches later becomes easier.

On 2020-06-16, Roman Horváth wrote:

> The first sentence within the file comments sounds:

> Replacement for java.util.ZipFile.

> But in fact, it should sound:

> Replacement for java.util.zip.ZipFile.

> (Similarly, a few lines later.)

True.

> In the constructor ZipFile(final File f, final String encoding, final boolean useUnicodeExtraFields) a tiny performance improvement may be made.

Here and in the case of ZipOutputStream#setLevel I'd assume the JIT is
going to eat up the performance gain quickly. Given the constructor (and
setLevel) are likely to be called at most once it will be dwarfed by the
time spent on (de)compression anyway. In this case I'd go for readablity
over performance. This doesn't mean

> closed = !success;
> if (!success) {

was more or less readable then

> closed = !success;
> if (closed) {

in general. But "do the if part if opening the archive has not been
successful" conveys the intent of the code better than "do the if part
if we've set the closed flag".

> I noticed that the ZipFile class got a new method – getName(). This method does not have a description. It may be (e.g.) simply (I guessed the version – this was the version when I noticed it):

Well, that somehow slipped under the radar. I don't think it is used by
anybody right now. But yes, the docs should be updated.

> I suggest adding a new feature to this class – the overall archive comment. (Because the class ZipOutputStream has the method setComment.)

See above. I agree the feature would be useful in Commons Conpress. In
Ant not so much.

Thanks

        Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Knowledge sharing related to Apache Ant, mainly ZipFile class…

Roman Horváth
Hello, Stefan,

thank you, I understand the circumstances and agree – Ant’s zip should stay simple whereas there is a “superstructure”… I think that I should explore the commons compress project. It’s possible that it would fit my needs better.

Thanks again… :-)

Roman

----- Pôvodná správa -----
Od: "Stefan Bodewig" <[hidden email]>
Komu: [hidden email]
Odoslané: Pondelok, 22. Jún 2020 10:41:12
Predmet: Re: Knowledge sharing related to Apache Ant, mainly ZipFile class…

Hello Roman

many thanks for your comments and your interest in Ant's zip support.

Ant's zip package has been the inspiration and nucleus for the zip
package in Apache Commons Compress and the code has become more evolved
and more versatile over there:

    https://commons.apache.org/proper/commons-compress/

Every now and then we've backported bugfixes from Commons Compress but
not additional features. The goal has been to keep the zip package
inside of Ant focussed on what is actually used by Ant's tasks. That the
classes go beyond that scope is mostly due to backwards
compatibility. Therefore I wouldn't want to add comment support to
ZipFile in Ant (as nobody would use it) but could see it being added in
Commons Compress.

In order to make backporting easier we try to not change the code
structure here without changing it on Commons Compress as well - so that
applying patches later becomes easier.

On 2020-06-16, Roman Horváth wrote:

> The first sentence within the file comments sounds:

> Replacement for java.util.ZipFile.

> But in fact, it should sound:

> Replacement for java.util.zip.ZipFile.

> (Similarly, a few lines later.)

True.

> In the constructor ZipFile(final File f, final String encoding, final boolean useUnicodeExtraFields) a tiny performance improvement may be made.

Here and in the case of ZipOutputStream#setLevel I'd assume the JIT is
going to eat up the performance gain quickly. Given the constructor (and
setLevel) are likely to be called at most once it will be dwarfed by the
time spent on (de)compression anyway. In this case I'd go for readablity
over performance. This doesn't mean

> closed = !success;
> if (!success) {

was more or less readable then

> closed = !success;
> if (closed) {

in general. But "do the if part if opening the archive has not been
successful" conveys the intent of the code better than "do the if part
if we've set the closed flag".

> I noticed that the ZipFile class got a new method – getName(). This method does not have a description. It may be (e.g.) simply (I guessed the version – this was the version when I noticed it):

Well, that somehow slipped under the radar. I don't think it is used by
anybody right now. But yes, the docs should be updated.

> I suggest adding a new feature to this class – the overall archive comment. (Because the class ZipOutputStream has the method setComment.)

See above. I agree the feature would be useful in Commons Conpress. In
Ant not so much.

Thanks

        Stefan

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]