Bug in TIdAttachment::FileName ?
#1
Hi

I think I found a small bug in the TIdAttachment::FileName property.

I have received a mail with an attachment that has a very stupid filename:
Fakturanr. 119180921 fra Lemvigh-Müller A/S.pdf

Indy only gives me the S.pdf part; the rest is discarded.

I know that / isn't allowed in a filename on Windows, and it would be OK if Indy replaced it, but it should not discard everything in front of it.

I know the real filename because Thunderbird shows all of it, and when I save the attachment it replaces the / with -.

Can I do anything to get the full filename?

Thanks in advance
Best regards
Asger

The code I use is this:

Code:
bool pdfMail::getPdfAttachments(TIdMessageParts *parts, TMailContent* MailContent )
{
  parts->CountParts();
  if( parts->AttachmentCount < 1 )return false;

  int C = parts->Count;
  for( int i = 0; i < C; ++i )
  {
      TIdAttachment *atcm = dynamic_cast<TIdAttachment*>( parts->Items[i] );

      if( atcm && attachmentsIsPdf( atcm->FileName ) )// checks if the extension is pdf
      {
        MailContent->Attachments.Add( atcm );
      }
  }

  return MailContent->Attachments.Count() > 0;
}
#2
(08-27-2020, 05:06 PM)Asger-P Wrote: I have received a mail with an attachment that has a very stupid filename:
Fakturanr. 119180921 fra Lemvigh-Müller A/S.pdf

Indy only gives me the S.pdf part; the rest is discarded.

The parsed value of the TIdAttachment::FileName property comes from TIdMessageDecoderMIME::ReadHeader() (assuming a MIME-encoded email), which retrieves the filename by decoding the 'Content-Disposition' and 'Content-Type' headers (in that order).  Ultimately, the decoded data passes through TIdMessageDecoderMIME::RemoveInvalidCharsFromFilename() before being assigned to the FileName property, and that does indeed treat '/' as a path delimiter and removes everything up to and including the last '/'.  An attachment's filename is not supposed to contain path information.
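
To illustrate the truncation described above, here is a minimal standalone sketch in standard C++. It is not Indy's source code: keepLastPathComponent is a hypothetical name, and the function only assumes the behavior as described, namely that everything up to and including the last '/' or '\' is dropped.

```cpp
#include <string>

// Hypothetical helper (NOT Indy's implementation): treat '/' and '\' as
// path delimiters and keep only the text after the last one.
std::string keepLastPathComponent(const std::string &name)
{
    const std::string::size_type pos = name.find_last_of("/\\");
    if (pos == std::string::npos)
        return name;              // no delimiter found: name is unchanged
    return name.substr(pos + 1);  // drop everything up to and including it
}
```

Passing the filename from the original post, "Fakturanr. 119180921 fra Lemvigh-Müller A/S.pdf", returns "S.pdf", which matches what was observed.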

(08-27-2020, 05:06 PM)Asger-P Wrote: Can I do anything to get the full filename ??

Without altering Indy's source code (I would be hesitant to do that without seeing an RFC spec that says this behavior is wrong), you could manually parse out the filename from the TIdAttachment::ContentDisposition and TIdAttachment::ContentType properties the same way Indy does, and just skip the final step that truncates the value, e.g.:

Code:
...
#include <IdGlobalProtocols.hpp>
#include <IdCoderHeader.hpp>

...

if( atcm /*&& attachmentsIsPdf( atcm->FileName )*/ )// the pdf check is done further down, on the re-parsed filename
{
    String filename = ExtractHeaderSubItem(atcm->ContentDisposition, "filename", QuoteMIME);
    if( filename == "" )
        filename = ExtractHeaderSubItem(atcm->ContentType, "name", QuoteMIME);

    /* alternatively:
    String filename = atcm->Headers->Params["Content-Disposition"]["filename"];
    if( filename == "" )
        filename = atcm->Headers->Params["Content-Type"]["name"];
    */

    if( filename != "" )
    {
        filename = DecodeHeader(filename);
        //filename = RemoveInvalidCharsFromFilename(filename); // <-- skip this! Or mutate the filename however you want...
    }

    if( attachmentsIsPdf( filename ) ) // checks if the extension is pdf
        MailContent->Attachments.Add( atcm );
}

#3
Hi Remy 

Thank you very much for the code, I'll try that.

rlebeau wrote:
Without altering Indy's source code (I would be hesitant to do that without seeing an RFC spec that says this behavior is wrong),



I don't think you will find anywhere in the RFC that it says: "if you encounter a slash or a backslash you 
can safely assume that everything before that is illegal path information and delete it", either.  ;-) 

Best regards
Asger
#4
(08-28-2020, 10:44 AM)Asger-P Wrote: I don't think you will find anywhere in the RFC that it says: "if you encounter a slash or a backslash you 
can safely assume that everything before that is illegal path information and delete it", either.  ;-) 

RFC 2183 does say:

Quote:It is important that the receiving MUA not blindly use the suggested filename.  The suggested filename SHOULD be checked (and possibly changed) to see that it conforms to local filesystem conventions, does not overwrite an existing file, and does not present a security problem (see Security Considerations below).

The receiving MUA SHOULD NOT respect any directory path information that may seem to be present in the filename parameter.  The filename should be treated as a terminal component only.  Portable specification of directory paths might possibly be done in the future via a separate Content-Disposition parameter, but no provision is made for it in this draft.

Not quite the same thing, though, I suppose.  Looking at the current implementation of TIdMessageDecoderMIME.RemoveInvalidCharsFromFilename(), if I were to remove/disable the logic that truncates on the last '/' or '\' character, the remaining logic would convert them to '_' instead.  So, it may be worth considering for a future release.  Perhaps by making the behavior configurable at runtime via a global variable or something.
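
As a rough illustration of the "convert instead of truncate" idea, here is a minimal standalone sketch in standard C++. It is not Indy's code: replaceInvalidFilenameChars is a hypothetical name, and the list of characters is an assumption based on the characters Windows reserves in filenames, not the list Indy actually uses.

```cpp
#include <string>

// Hypothetical helper (NOT Indy's implementation): replace every character
// that Windows does not allow in a filename with `replacement`, keeping the
// rest of the name intact instead of truncating it.
std::string replaceInvalidFilenameChars(std::string name, char replacement = '_')
{
    // Assumed set: the characters Windows reserves in filenames.
    static const std::string invalid = "/\\:*?\"<>|";
    for (char &ch : name)
    {
        if (invalid.find(ch) != std::string::npos)
            ch = replacement;
    }
    return name;
}
```

With the default replacement the filename from the original post keeps its full text and only the '/' becomes '_'. Passing '-' as the second argument gives the same result Thunderbird was reported to produce when saving.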

#5
Hi Remy

I can see there is room for the interpretation you made, but I really do believe it is my job to
check the filename against the system. After all, you don't have any information about what
the filename will be used for; I might just want to make a list of the attachments.

Replacing it with '_' is better for me, but I think it would be just as OK to leave it as is, so that
I have to check according to the rules on my system, as I do anyway.

Best regards
Asger