2 Replies Latest reply: Sep 30, 2018 8:50 AM by Akbar Jiwani RSS

    Need help with Data Model

    Akbar Jiwani

      I am republishing this question with more details:

       

      I have following 3 tables

      MProgram

      ProgramCenterIDFeesVenueDate
      1300Venue120 sept 2018
      2400Venue 221 sept 2018
      3200Venue 325 sept 208
      4250Venue 425 sept 2018
      5100Venue 526 sept 2018

       

      MParticipant

      ParticipantIDFirst NameMiddle NameLast NameDOB
      1F1M1L124 Jan 1995
      2F2M2L223 Dec 1985
      3F3M3L313 Oct 1999
      4F4M4L414 Oct 1993
      5F5M5L523 Apr 2001
      6F6M6L624 Aug 2003
      7F7M7L724 Dec 2005
      8F8M8L825 Apr 2008
      9F9M9L926 Aug 2010
      10F10M10L1026 Dec 2012

       

      MProgramParticipant

      ProgramCenterIDParticipantID
      11
      12
      13
      21
      23
      25
      22
      33
      34
      35
      31
      41
      42
      43
      44
      52
      53
      54
      55

       

      When I upload the these tables to Qlik Sense, following associations are created:

      1. MProgram.ProgramCenterID <> MProgramParticipant.ProgramCenterID

      2. MParticipant.ParticipantID <> MProgramParticipant.ParticipantID

       

      When I display Count of ParticipantID, it displays 10. Where as I want to display count of participants who attended atleast 1 program. i.e. 5

      I also want to display Total number of Participants. i.e. simply non-distinct count of rows in table MProgramParticipant

      I can't figure out how to do this. Any thoughts?

        • Re: Need help with Data Model
          Petter Skjolden

          First of all doing any kind of aggregations on the key fields lead to ambiguous results. You can see this as Count(ParticipantID) gives 10 which is what you would expect whereas Count(ProgramCenterID) will give you 19 which is not intuitive.

           

          So to have some control of what you are aggregating you would need a set of helper columns which you can use to aggregate correctly.

           

          Add an extra field to each of the tables that are a non-key fields:

           

          You load script would look like this:

           

          [PROGRAM CENTER]:
          LOAD
              RowNo() AS PC#,
              ProgramCenterID,
              Fees,
              Venue,
              Date
          FROM [lib://DATA]
          (html, utf8, embedded labels, table is @1);
          
          
          [PARTICIPANT]:
          LOAD
              RowNo() AS P#,
              ParticipantID,
              [First Name],
              [Middle Name],
              [Last Name],
              DOB
          FROM [lib://DATA]
          (html, utf8, embedded labels, table is @2);
          
          
          [CENTER_PARTICIPANT]:
          LOAD
              RowNo() AS C_P#,
              ProgramCenterID,
              ParticipantID
          FROM [lib://DATA]
          (html, utf8, embedded labels, table is @3);
          
          

           

          The number of program participants that attend at least one program is:

           

          Count( {<P#=P(PC#)>} P#)

           

          Using the P() element function will make sure that there are any possible program centers associated with a participant.

           

           

          Count( C_P# )

           

          This is straightforward as this is a unique row id for the table CENTER_PARTICIPANT.